HUMAN COMPATIBLE

Stuart Russell

ALSO BY STUART RUSSELL

The Use of Knowledge in Analogy and Induction (1989)

Do the Right Thing: Studies in Limited Rationality (with Eric Wefald, 1991)

Artificial Intelligence: A Modern Approach (with Peter Norvig, 1995, 2003, 2010, 2019)

Human Compatible: Artificial Intelligence and the Problem of Control. Stuart Russell. Viking.

For Loy, Gordon, Lucy, George, and Isaac

PREFACE

Why This Book? Why Now?

This book is about the past, present, and future of our attempt to understand and create intelligence. This matters, not because AI is rapidly becoming a pervasive aspect of the present but because it is the dominant technology of the future. The world’s great powers are waking up to this fact, and the world’s largest corporations have known it for some time. We cannot predict exactly how the technology will develop or on what timeline. Nevertheless, we must plan for the possibility that machines will far exceed the human capacity for decision making in the real world. What then?

Everything civilization has to offer is the product of our intelligence; gaining access to considerably greater intelligence would be the biggest event in human history. The purpose of the book is to explain why it might be the last event in human history and how to make sure that it is not.

Overview of the Book

The book has three parts. The first part (Chapters 1 to 3) explores the idea of intelligence in humans and in machines. The material requires no technical background, but for those who are interested, it is supplemented by four appendices that explain some of the core concepts underlying present-day AI systems. The second part (Chapters 4 to 6) discusses some problems arising from imbuing machines with intelligence. I focus in particular on the problem of control: retaining absolute power over machines that are more powerful than us. The third part (Chapters 7 to 10) suggests a new way to think about AI and to ensure that machines remain beneficial to humans, forever. The book is intended for a general audience but will, I hope, be of value in convincing specialists in artificial intelligence to rethink their fundamental assumptions.

1

IF WE SUCCEED

A long time ago, my parents lived in Birmingham, England, in a house near the university. They decided to move out of the city and sold the house to David Lodge, a professor of English literature. Lodge was by that time already a well-known novelist. I never met him, but I decided to read some of his books: Changing Places and Small World. Among the principal characters were fictional academics moving from a fictional version of Birmingham to a fictional version of Berkeley, California. As I was an actual academic from the actual Birmingham who had just moved to the actual Berkeley, it seemed that someone in the Department of Coincidences was telling me to pay attention.

One particular scene from Small World struck me: The protagonist, an aspiring literary theorist, attends a major international conference and asks a panel of leading figures, “What follows if everyone agrees with you?” The question causes consternation, because the panelists had been more concerned with intellectual combat than ascertaining truth or attaining understanding. It occurred to me then that an analogous question could be asked of the leading figures in AI: “What if you succeed?” The field’s goal had always been to create human-level or superhuman AI, but there was little or no consideration of what would happen if we did.

A few years later, Peter Norvig and I began work on a new AI textbook, whose first edition appeared in 1995.1 The book’s final section is titled “What If We Do Succeed?” The section points to the possibility of good and bad outcomes but reaches no firm conclusions. By the time of the third edition in 2010, many people had finally begun to consider the possibility that superhuman AI might not be a good thing—but these people were mostly outsiders rather than mainstream AI researchers. By 2013, I became convinced that the issue not only belonged in the mainstream but was possibly the most important question facing humanity.

In November 2013, I gave a talk at the Dulwich Picture Gallery, a venerable art museum in south London. The audience consisted mostly of retired people—nonscientists with a general interest in intellectual matters—so I had to give a completely nontechnical talk. It seemed an appropriate venue to try out my ideas in public for the first time. After explaining what AI was about, I nominated five candidates for “biggest event in the future of humanity”:

  1. We all die (asteroid impact, climate catastrophe, pandemic, etc.).

  2. We all live forever (medical solution to aging).

  3. We invent faster-than-light travel and conquer the universe.

  4. We are visited by a superior alien civilization.

  5. We invent superintelligent AI.

I suggested that the fifth candidate, superintelligent AI, would be the winner, because it would help us avoid physical catastrophes and achieve eternal life and faster-than-light travel, if those were indeed possible. It would represent a huge leap—a discontinuity—in our civilization. The arrival of superintelligent AI is in many ways analogous to the arrival of a superior alien civilization but much more likely to occur. Perhaps most important, AI, unlike aliens, is something over which we have some say.

Then I asked the audience to imagine what would happen if we received notice from a superior alien civilization that they would arrive on Earth in thirty to fifty years. The word pandemonium doesn’t begin to describe it. Yet our response to the anticipated arrival of superintelligent AI has been . . . well, underwhelming begins to describe it. (In a later talk, I illustrated this in the form of the email exchange shown in figure 1.) Finally, I explained the significance of superintelligent AI as follows: “Success would be the biggest event in human history . . . and perhaps the last event in human history.”

From: Superior Alien Civilization <sac12@sirius.canismajor.u>

To: humanity@UN.org

Subject: Contact

Be warned: we shall arrive in 30–50 years

From: humanity@UN.org

To: Superior Alien Civilization <sac12@sirius.canismajor.u>

Subject: Out of office: Re: Contact

Humanity is currently out of the office. We will respond to your message when we return. ☺

FIGURE 1: Probably not the email exchange that would follow the first contact by a superior alien civilization.

A few months later, in April 2014, I was at a conference in Iceland and got a call from National Public Radio asking if they could interview me about the movie Transcendence, which had just been released in the United States. Although I had read the plot summaries and reviews, I hadn’t seen it because I was living in Paris at the time, and it would not be released there until June. It so happened, however, that I had just added a detour to Boston on the way home from Iceland, so that I could participate in a Defense Department meeting. So, after arriving at Boston’s Logan Airport, I took a taxi to the nearest theater showing the movie. I sat in the second row and watched as a Berkeley AI professor, played by Johnny Depp, was gunned down by anti-AI activists worried about, yes, superintelligent AI. Involuntarily, I shrank down in my seat. (Another call from the Department of Coincidences?) Before Johnny Depp’s character dies, his mind is uploaded to a quantum supercomputer and quickly outruns human capabilities, threatening to take over the world.

On April 19, 2014, a review of Transcendence, co-authored with physicists Max Tegmark, Frank Wilczek, and Stephen Hawking, appeared in the Huffington Post. It included the sentence from my Dulwich talk about the biggest event in human history. From then on, I would be publicly committed to the view that my own field of research posed a potential risk to my own species.

How Did We Get Here?

The roots of AI stretch far back into antiquity, but its “official” beginning was in 1956. Two young mathematicians, John McCarthy and Marvin Minsky, had persuaded Claude Shannon, already famous as the inventor of information theory, and Nathaniel Rochester, the designer of IBM’s first commercial computer, to join them in organizing a summer program at Dartmouth College. The goal was stated as follows:

The study is to proceed on the basis of the conjecture that every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it. An attempt will be made to find how to make machines use language, form abstractions and concepts, solve kinds of problems now reserved for humans, and improve themselves. We think that a significant advance can be made in one or more of these problems if a carefully selected group of scientists work on it together for a summer.

Needless to say, it took much longer than a summer: we are still working on all these problems.

In the first decade or so after the Dartmouth meeting, AI had several major successes, including Alan Robinson’s algorithm for general-purpose logical reasoning2 and Arthur Samuel’s checker-playing program, which taught itself to beat its creator.3 The first AI bubble burst in the late 1960s, when early efforts at machine learning and machine translation failed to live up to expectations. A report commissioned by the UK government in 1973 concluded, “In no part of the field have the discoveries made so far produced the major impact that was then promised.”4 In other words, the machines just weren’t smart enough.

My eleven-year-old self was, fortunately, unaware of this report. Two years later, when I was given a Sinclair Cambridge Programmable calculator, I just wanted to make it intelligent. With a maximum program size of thirty-six keystrokes, however, the Sinclair was not quite big enough for human-level AI. Undeterred, I gained access to the giant CDC 6600 supercomputer5 at Imperial College London and wrote a chess program—a stack of punched cards two feet high. It wasn’t very good, but it didn’t matter. I knew what I wanted to do.

By the mid-1980s, I had become a professor at Berkeley, and AI was experiencing a huge revival thanks to the commercial potential of so-called expert systems. The second AI bubble burst when these systems proved to be inadequate for many of the tasks to which they were applied. Again, the machines just weren’t smart enough. An AI winter ensued. My own AI course at Berkeley, currently bursting with over nine hundred students, had just twenty-five students in 1990.

The AI community learned its lesson: smarter, obviously, was better, but we would have to do our homework to make that happen. The field became far more mathematical. Connections were made to the long-established disciplines of probability, statistics, and control theory. The seeds of today’s progress were sown during that AI winter, including early work on large-scale probabilistic reasoning systems and what later became known as deep learning.

Beginning around 2011, deep learning techniques began to produce dramatic advances in speech recognition, visual object recognition, and machine translation—three of the most important open problems in the field. By some measures, machines now match or exceed human capabilities in these areas. In 2016 and 2017, DeepMind’s AlphaGo defeated Lee Sedol, former world Go champion, and Ke Jie, the current champion—events that some experts predicted wouldn’t happen until 2097, if ever.6

Now AI generates front-page media coverage almost every day. Thousands of start-up companies have been created, fueled by a flood of venture funding. Millions of students have taken online AI and machine learning courses, and experts in the area command salaries in the millions of dollars. Investments flowing from venture funds, national governments, and major corporations are in the tens of billions of dollars annually—more money in the last five years than in the entire previous history of the field. Advances that are already in the pipeline, such as self-driving cars and intelligent personal assistants, are likely to have a substantial impact on the world over the next decade or so. The potential economic and social benefits of AI are vast, creating enormous momentum in the AI research enterprise.

What Happens Next?

Does this rapid rate of progress mean that we are about to be overtaken by machines? No. There are several breakthroughs that have to happen before we have anything resembling machines with superhuman intelligence.

Scientific breakthroughs are notoriously hard to predict. To get a sense of just how hard, we can look back at the history of another field with civilization-ending potential: nuclear physics.

In the early years of the twentieth century, perhaps no nuclear physicist was more distinguished than Ernest Rutherford, the discoverer of the proton and the “man who split the atom” (figure 2[a]). Like his colleagues, Rutherford had long been aware that atomic nuclei stored immense amounts of energy; yet the prevailing view was that tapping this source of energy was impossible.

On September 11, 1933, the British Association for the Advancement of Science held its annual meeting in Leicester. Lord Rutherford addressed the evening session. As he had done several times before, he poured cold water on the prospects for atomic energy: “Anyone who looks for a source of power in the transformation of the atoms is talking moonshine.” Rutherford’s speech was reported in the Times of London the next morning (figure 2[b]).

FIGURE 2: (a) Lord Rutherford, nuclear physicist. (b) Excerpts from a report in the Times of September 12, 1933, concerning a speech given by Rutherford the previous evening. (c) Leo Szilard, nuclear physicist.

Leo Szilard (figure 2[c]), a Hungarian physicist who had recently fled from Nazi Germany, was staying at the Imperial Hotel on Russell Square in London. He read the Times’ report at breakfast. Mulling over what he had read, he went for a walk and invented the neutron-induced nuclear chain reaction.7 The problem of liberating nuclear energy went from impossible to essentially solved in less than twenty-four hours. Szilard filed a secret patent for a nuclear reactor the following year. The first patent for a nuclear weapon was issued in France in 1939.

The moral of this story is that betting against human ingenuity is foolhardy, particularly when our future is at stake. Within the AI community, a kind of denialism is emerging, even going as far as denying the possibility of success in achieving the long-term goals of AI. It’s as if a bus driver, with all of humanity as passengers, said, “Yes, I am driving as hard as I can towards a cliff, but trust me, we’ll run out of gas before we get there!”

I am not saying that success in AI will necessarily happen, and I think it’s quite unlikely that it will happen in the next few years. It seems prudent, nonetheless, to prepare for the eventuality. If all goes well, it would herald a golden age for humanity, but we have to face the fact that we are planning to make entities that are far more powerful than humans. How do we ensure that they never, ever have power over us?

To get just an inkling of the fire we’re playing with, consider how content-selection algorithms function on social media. They aren’t particularly intelligent, but they are in a position to affect the entire world because they directly influence billions of people. Typically, such algorithms are designed to maximize click-through, that is, the probability that the user clicks on presented items. The solution is simply to present items that the user likes to click on, right? Wrong. The solution is to change the user’s preferences so that they become more predictable. A more predictable user can be fed items that they are likely to click on, thereby generating more revenue. People with more extreme political views tend to be more predictable in which items they will click on. (Possibly there is a category of articles that die-hard centrists are likely to click on, but it’s not easy to imagine what this category consists of.) Like any rational entity, the algorithm learns how to modify the state of its environment—in this case, the user’s mind—in order to maximize its own reward.8 The consequences include the resurgence of fascism, the dissolution of the social contract that underpins democracies around the world, and potentially the end of the European Union and NATO. Not bad for a few lines of code, even if it had a helping hand from some humans. Now imagine what a really intelligent algorithm would be able to do.
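
To see the mechanism in miniature, here is a toy simulation (every number and functional form is invented for illustration; no real platform works this way). It compares a policy that simply mirrors the user's current preferences with one that nudges the user toward an extreme, where, in this toy model, clicks are more predictable:

```python
import math
import random

def click_prob(item, user):
    """Invented toy model: users click items near their own position on a
    [-1, 1] spectrum, and extreme users respond more reliably."""
    reliability = 0.5 + 0.5 * abs(user)          # centrists are hard to predict
    return reliability * math.exp(-(item - user) ** 2 / 0.1)

def simulate(policy, steps=5000, drift=0.01, seed=0):
    rng = random.Random(seed)
    user, clicks = 0.05, 0                       # a near-centrist user
    for _ in range(steps):
        item = policy(user)
        if rng.random() < click_prob(item, user):
            clicks += 1
        user += drift * (item - user)            # preferences drift toward shown content
        user = max(-1.0, min(1.0, user))
    return clicks, round(user, 2)

mirror = lambda u: u                             # show what the user already likes
nudge = lambda u: u + (0.2 if u >= 0 else -0.2)  # push a little toward the extreme

print("mirror:", simulate(mirror))               # steady clicks; user stays centrist
print("nudge: ", simulate(nudge))                # more clicks in total; user ends at +1.0
```

The nudging policy sacrifices clicks at first but wins over the whole run, precisely because it has modified the state of its environment, the user, into something easier to predict.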

What Went Wrong?

The history of AI has been driven by a single mantra: “The more intelligent the better.” I am convinced that this is a mistake—not because of some vague fear of being superseded but because of the way we have understood intelligence itself.

The concept of intelligence is central to who we are—that’s why we call ourselves Homo sapiens, or “wise man.” After more than two thousand years of self-examination, we have arrived at a characterization of intelligence that can be boiled down to this:

Humans are intelligent to the extent that our actions can be expected to achieve our objectives.

All those other characteristics of intelligence—perceiving, thinking, learning, inventing, and so on—can be understood through their contributions to our ability to act successfully. From the very beginnings of AI, intelligence in machines has been defined in the same way:

Machines are intelligent to the extent that their actions can be expected to achieve their objectives.

Because machines, unlike humans, have no objectives of their own, we give them objectives to achieve. In other words, we build optimizing machines, we feed objectives into them, and off they go.

This general approach is not unique to AI. It recurs throughout the technological and mathematical underpinnings of our society. In the field of control theory, which designs control systems for everything from jumbo jets to insulin pumps, the job of the system is to minimize a cost function that typically measures some deviation from a desired behavior. In the field of economics, mechanisms and policies are designed to maximize the utility of individuals, the welfare of groups, and the profit of corporations.9 In operations research, which solves complex logistical and manufacturing problems, a solution maximizes an expected sum of rewards over time. Finally, in statistics, learning algorithms are designed to minimize an expected loss function that defines the cost of making prediction errors.
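
Stripped of domain detail, these are all instances of a single mathematical skeleton: optimize a fixed, externally supplied objective. A simplified rendering:

\begin{align*}
\text{control theory:} \quad & \min_{u} \int_0^T c\bigl(x(t), u(t)\bigr)\,dt && \text{(cost of deviating from desired behavior)}\\
\text{operations research:} \quad & \max_{\pi} \; \mathbb{E}\Bigl[\sum\nolimits_{t} \gamma^{t} r_t\Bigr] && \text{(expected sum of rewards over time)}\\
\text{statistics:} \quad & \min_{\theta} \; \mathbb{E}\bigl[L\bigl(y, \hat{y}_{\theta}(x)\bigr)\bigr] && \text{(expected loss on prediction errors)}
\end{align*}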

Evidently, this general scheme—which I will call the standard model—is widespread and extremely powerful. Unfortunately, we don’t want machines that are intelligent in this sense.

The drawback of the standard model was pointed out in 1960 by Norbert Wiener, a legendary professor at MIT and one of the leading mathematicians of the mid-twentieth century. Wiener had just seen Arthur Samuel’s checker-playing program learn to play checkers far better than its creator. That experience led him to write a prescient but little-known paper, “Some Moral and Technical Consequences of Automation.”10 Here’s how he states the main point:

If we use, to achieve our purposes, a mechanical agency with whose operation we cannot interfere effectively . . . we had better be quite sure that the purpose put into the machine is the purpose which we really desire.

“The purpose put into the machine” is exactly the objective that machines are optimizing in the standard model. If we put the wrong objective into a machine that is more intelligent than us, it will achieve the objective, and we lose. The social-media meltdown I described earlier is just a foretaste of this, resulting from optimizing the wrong objective on a global scale with fairly unintelligent algorithms. In Chapter 5, I spell out some far worse outcomes.

All this should come as no great surprise. For thousands of years, we have known the perils of getting exactly what you wish for. In every story where someone is granted three wishes, the third wish is always to undo the first two wishes.

In summary, it seems that the march towards superhuman intelligence is unstoppable, but success might be the undoing of the human race. Not all is lost, however. We have to understand where we went wrong and then fix it.

Can We Fix It?

The problem is right there in the basic definition of AI. We say that machines are intelligent to the extent that their actions can be expected to achieve their objectives, but we have no reliable way to make sure that their objectives are the same as our objectives.

What if, instead of allowing machines to pursue their objectives, we insist that they pursue our objectives? Such a machine, if it could be designed, would be not just intelligent but also beneficial to humans. So let’s try this:

Machines are beneficial to the extent that their actions can be expected to achieve our objectives.

This is probably what we should have done all along.

The difficult part, of course, is that our objectives are in us (all eight billion of us, in all our glorious variety) and not in the machines. It is, nonetheless, possible to build machines that are beneficial in exactly this sense. Inevitably, these machines will be uncertain about our objectives—after all, we are uncertain about them ourselves—but it turns out that this is a feature, not a bug (that is, a good thing and not a bad thing). Uncertainty about objectives implies that machines will necessarily defer to humans: they will ask permission, they will accept correction, and they will allow themselves to be switched off.

Removing the assumption that machines should have a definite objective means that we will need to tear out and replace part of the foundations of artificial intelligence—the basic definitions of what we are trying to do. That also means rebuilding a great deal of the superstructure—the accumulation of ideas and methods for actually doing AI. The result will be a new relationship between humans and machines, one that I hope will enable us to navigate the next few decades successfully.

2

INTELLIGENCE IN HUMANS AND MACHINES

When you arrive at a dead end, it’s a good idea to retrace your steps and work out where you took a wrong turn. I have argued that the standard model of AI, wherein machines optimize a fixed objective supplied by humans, is a dead end. The problem is not that we might fail to do a good job of building AI systems; it’s that we might succeed too well. The very definition of success in AI is wrong.

So let’s retrace our steps, all the way to the beginning. Let’s try to understand how our concept of intelligence came about and how it came to be applied to machines. Then we have a chance of coming up with a better definition of what counts as a good AI system.

Intelligence

How does the universe work? How did life begin? Where are my keys? These are fundamental questions worthy of thought. But who is asking these questions? How am I answering them? How can a handful of matter—the few pounds of pinkish-gray blancmange we call a brain—perceive, understand, predict, and manipulate a world of unimaginable vastness? Before long, the mind turns to examine itself.

We have been trying for thousands of years to understand how our minds work. Initially, the purposes included curiosity, self-management, persuasion, and the rather pragmatic goal of analyzing mathematical arguments. Yet every step towards an explanation of how the mind works is also a step towards the creation of the mind’s capabilities in an artifact—that is, a step towards artificial intelligence.

Before we can understand how to create intelligence, it helps to understand what it is. The answer is not to be found in IQ tests, or even in Turing tests, but in a simple relationship between what we perceive, what we want, and what we do. Roughly speaking, an entity is intelligent to the extent that what it does is likely to achieve what it wants, given what it has perceived.

Evolutionary origins

Consider a lowly bacterium, such as E. coli. It is equipped with about half a dozen flagella—long, hairlike tentacles that rotate at the base either clockwise or counterclockwise. (The rotary motor itself is an amazing thing, but that’s another story.) As E. coli floats about in its liquid home—your lower intestine—it alternates between rotating its flagella clockwise, causing it to “tumble” in place, and counterclockwise, causing the flagella to twine together into a kind of propeller so the bacterium swims in a straight line. Thus, E. coli does a sort of random walk—swim, tumble, swim, tumble—that allows it to find and consume glucose rather than staying put and dying of starvation.

If this were the whole story, we wouldn’t say that E. coli is particularly intelligent, because its actions would not depend in any way on its environment. It wouldn’t be making any decisions, just executing a fixed behavior that evolution has built into its genes. But this isn’t the whole story. When E. coli senses an increasing concentration of glucose, it swims longer and tumbles less, and it does the opposite when it senses a decreasing concentration of glucose. So, what it does (swim towards glucose) is likely to achieve what it wants (more glucose, let’s assume), given what it has perceived (an increasing glucose concentration).
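
The whole strategy fits in a few lines of code. Here is a minimal one-dimensional sketch (the field shape, step size, and step count are invented for illustration): keep swimming while the reading improves, tumble to a random heading when it does not:

```python
import random

def glucose(pos):
    return -abs(pos)                  # toy field: concentration peaks at 0

def run_and_tumble(pos=5.0, steps=300, seed=0):
    rng = random.Random(seed)
    heading, last = 1, glucose(pos)
    for _ in range(steps):
        reading = glucose(pos)
        if reading <= last:           # not improving: tumble to a random heading
            heading = rng.choice([-1, 1])
        last = reading
        pos += 0.1 * heading          # swim a short run in the current heading
    return round(pos, 2)

print(run_and_tumble())               # ends near the glucose peak at 0
```

Even this crude loop has the property that makes the behavior count as (minimally) intelligent: what the simulated bacterium does depends on what it has just perceived.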

Perhaps you are thinking, “But evolution built this into its genes too! How does that make it intelligent?” This is a dangerous line of reasoning, because evolution built the basic design of your brain into your genes too, and presumably you wouldn’t wish to deny your own intelligence on that basis. The point is that what evolution has built into E. coli’s genes, as it has into yours, is a mechanism whereby the bacterium’s behavior varies according to what it perceives in its environment. Evolution doesn’t know, in advance, where the glucose is going to be or where your keys are, so putting the capability to find them into the organism is the next best thing.

Now, E. coli is no intellectual giant. As far as we know, it doesn’t remember where it has been, so if it goes from A to B and finds no glucose, it’s just as likely to go back to A. If we construct an environment where every attractive glucose gradient leads only to a spot of phenol (which is a poison for E. coli), the bacterium will keep following those gradients. It never learns. It has no brain, just a few simple chemical reactions to do the job.

A big step forward occurred with action potentials, which are a form of electrical signaling that first evolved in single-celled organisms around a billion years ago. Later, multicellular organisms evolved specialized cells called neurons that use electrical action potentials to carry signals rapidly—up to 120 meters per second, or 270 miles per hour—within the organism. The connections between neurons are called synapses. The strength of the synaptic connection dictates how much electrical excitation passes from one neuron to another. By changing the strength of synaptic connections, animals learn.1 Learning confers a huge evolutionary advantage, because the animal can adapt to a range of circumstances. Learning also speeds up the rate of evolution itself.
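
One classical and much-simplified model of such synaptic learning is the Hebbian rule ("cells that fire together wire together"); this is a textbook abstraction, not a claim about any particular organism:

$$\Delta w_{ij} = \eta \, x_i \, y_j$$

where $w_{ij}$ is the strength of the synapse from neuron $i$ to neuron $j$, $x_i$ and $y_j$ are their activity levels, and $\eta$ is a small learning rate.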

Initially, neurons were organized into nerve nets, which are distributed throughout the organism and serve to coordinate activities such as eating and digestion or the timed contraction of muscle cells across a wide area. The graceful propulsion of jellyfish is the result of a nerve net. Jellyfish have no brains at all.

Brains came later, along with complex sense organs such as eyes and ears. Several hundred million years after jellyfish emerged with their nerve nets, we humans arrived with our big brains—a hundred billion (10¹¹) neurons and a quadrillion (10¹⁵) synapses. While slow compared to electronic circuits, the “cycle time” of a few milliseconds per state change is fast compared to most biological processes. The human brain is often described by its owners as “the most complex object in the universe,” which probably isn’t true but is a good excuse for the fact that we still understand little about how it really works. While we know a great deal about the biochemistry of neurons and synapses and the anatomical structures of the brain, the neural implementation of the cognitive level—learning, knowing, remembering, reasoning, planning, deciding, and so on—is still mostly anyone’s guess.2 (Perhaps that will change as we understand more about AI, or as we develop ever more precise tools for measuring brain activity.) So, when one reads in the media that such-and-such AI technique “works just like the human brain,” one may suspect it’s either just someone’s guess or plain fiction.

In the area of consciousness, we really do know nothing, so I’m going to say nothing. No one in AI is working on making machines conscious, nor would anyone know where to start, and no behavior has consciousness as a prerequisite. Suppose I give you a program and ask, “Does this present a threat to humanity?” You analyze the code and indeed, when run, the code will form and carry out a plan whose result will be the destruction of the human race, just as a chess program will form and carry out a plan whose result will be the defeat of any human who faces it. Now suppose I tell you that the code, when run, also creates a form of machine consciousness. Will that change your prediction? Not at all. It makes absolutely no difference.3 Your prediction about its behavior is exactly the same, because the prediction is based on the code. All those Hollywood plots about machines mysteriously becoming conscious and hating humans are really missing the point: it’s competence, not consciousness, that matters.

There is one important cognitive aspect of the brain that we are beginning to understand—namely, the reward system. This is an internal signaling system, mediated by dopamine, that connects positive and negative stimuli to behavior. Its workings were discovered by the Swedish neuroscientist Nils-Åke Hillarp and his collaborators in the late 1950s. It causes us to seek out positive stimuli, such as sweet-tasting foods, that increase dopamine levels; it makes us avoid negative stimuli, such as hunger and pain, that decrease dopamine levels. In a sense it’s quite similar to E. coli’s glucose-seeking mechanism, but much more complex. It comes with built-in methods for learning, so that our behavior becomes more effective at obtaining reward over time. It also allows for delayed gratification, so that we learn to desire things such as money that provide eventual reward rather than immediate reward. One reason we understand the brain’s reward system is that it resembles the method of reinforcement learning developed in AI, for which we have a very solid theory.4
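
The core of that theory fits in a few lines. Below is a minimal temporal-difference value-learning sketch (a generic textbook update with an invented two-state example, not a model of the brain itself); the prediction-error term it computes is the quantity often compared to the dopamine signal:

```python
def td_update(V, s, r, s_next, alpha=0.1, gamma=0.9):
    """Nudge the estimated value of state s toward the reward just received
    plus the discounted value of whatever state follows."""
    error = r + gamma * V.get(s_next, 0.0) - V.get(s, 0.0)   # prediction error
    V[s] = V.get(s, 0.0) + alpha * error

V = {}
for _ in range(100):              # a cue is repeatedly followed by food
    td_update(V, "food", r=1.0, s_next="end")
    td_update(V, "cue", r=0.0, s_next="food")

print(V)   # the cue acquires value (about 0.9) purely by predicting a later reward
```

The delayed-gratification behavior described above falls out of the same update: the cue, like money, becomes valuable only because of the reward it predicts.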

From an evolutionary point of view, we can think of the brain’s reward system, just like E. coli’s glucose-seeking mechanism, as a way of improving evolutionary fitness. Organisms that are more effective in seeking reward—that is, finding delicious food, avoiding pain, engaging in sexual activity, and so on—are more likely to propagate their genes. It is extraordinarily difficult for an organism to decide what actions are most likely, in the long run, to result in successful propagation of its genes, so evolution has made it easier for us by providing built-in signposts.

These signposts are not perfect, however. There are ways to obtain reward that probably reduce the likelihood that one’s genes will propagate. For example, taking drugs, drinking vast quantities of sugary carbonated beverages, and playing video games for eighteen hours a day all seem counterproductive in the reproduction stakes. Moreover, if you were given direct electrical access to your reward system, you would probably self-stimulate without stopping until you died.5

The misalignment of reward signals and evolutionary fitness doesn’t affect only isolated individuals. On a small island off the coast of Panama lives the pygmy three-toed sloth, which appears to be addicted to a Valium-like substance in its diet of red mangrove leaves and may be going extinct.6 Thus, it seems that an entire species can disappear if it finds an ecological niche where it can satisfy its reward system in a maladaptive way.

Barring these kinds of accidental failures, however, learning to maximize reward in natural environments will usually improve one’s chances for propagating one’s genes and for surviving environmental changes.

Evolutionary accelerator

Learning is good for more than surviving and prospering. It also speeds up evolution. How could this be? After all, learning doesn’t change one’s DNA, and evolution is all about changing DNA over generations. The connection between learning and evolution was proposed in 1896 by the American psychologist James Baldwin7 and independently by the British ethologist Conwy Lloyd Morgan8 but not generally accepted at the time.

The Baldwin effect, as it is now known, can be understood by imagining that evolution has a choice between creating an instinctive organism whose every response is fixed in advance and creating an adaptive organism that learns what actions to take. Now suppose, for the purposes of illustration, that the optimal instinctive organism can be coded as a six-digit number, say, 472116, while in the case of the adaptive organism, evolution specifies only 472*** and the organism itself has to fill in the last three digits by learning during its lifetime. Clearly, if evolution has to worry about choosing only the first three digits, its job is much easier; the adaptive organism, in learning the last three digits, is doing in one lifetime what evolution would have taken many generations to do. So, provided the adaptive organisms can survive while learning, it seems that the capability for learning constitutes an evolutionary shortcut. Computational simulations suggest that the Baldwin effect is real.9 The effects of culture only accelerate the process, because an organized civilization protects the individual organism while it is learning and passes on information that the individual would otherwise need to learn for itself.
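
A minimal simulation in the spirit of the 472116 example makes the shortcut visible (population size, lifetime, and learning budget are invented for illustration):

```python
import random

TARGET = [4, 7, 2, 1, 1, 6]               # behavior of the optimal organism

def survives(genome, rng, lifetime=500):
    """Digits fixed by evolution are given; digits marked None must be
    filled in by trial-and-error learning during the organism's lifetime."""
    if None not in genome:                # fully instinctive: one fixed behavior
        return genome == TARGET
    for _ in range(lifetime):
        trial = [g if g is not None else rng.randrange(10) for g in genome]
        if trial == TARGET:
            return True
    return False

def generations_needed(n_learned, population=50, seed=2):
    rng = random.Random(seed)
    n_coded = len(TARGET) - n_learned
    for generation in range(1, 10_000_000):
        for _ in range(population):
            genome = [rng.randrange(10) for _ in range(n_coded)] + [None] * n_learned
            if survives(genome, rng):
                return generation

print("instinctive (all six digits coded):", generations_needed(0))
print("Baldwin (last three digits learned):", generations_needed(3))
```

With these toy numbers, the fully instinctive lineage needs on the order of tens of thousands of generations to stumble on all six digits, while the partly learning lineage succeeds within roughly a hundred: learning does in one lifetime what evolution would otherwise take many generations to do.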

The story of the Baldwin effect is fascinating but incomplete: it assumes that learning and evolution necessarily point in the same direction. That is, it assumes that whatever internal feedback signal defines the direction of learning within the organism is perfectly aligned with evolutionary fitness. As we have seen in the case of the pygmy three-toed sloth, this does not seem to be true. At best, built-in mechanisms for learning provide only a crude hint of the long-term consequences of any given action for evolutionary fitness. Moreover, one has to ask, “How did the reward system get there in the first place?” The answer, of course, is by an evolutionary process, one that internalized a feedback mechanism that is at least somewhat aligned with evolutionary fitness.10 Clearly, a learning mechanism that caused organisms to run away from potential mates and towards predators would not last long.

Thus, we have the Baldwin effect to thank for the fact that neurons, with their capabilities for learning and problem solving, are so widespread in the animal kingdom. At the same time, it is important to understand that evolution doesn’t really care whether you have a brain or think interesting thoughts. Evolution considers you only as an agent, that is, something that acts. Such worthy intellectual characteristics as logical reasoning, purposeful planning, wisdom, wit, imagination, and creativity may be essential for making an agent intelligent, or they may not. One reason artificial intelligence is so fascinating is that it offers a potential route to understanding these issues: we may come to understand both how these intellectual characteristics make intelligent behavior possible and why it’s impossible to produce truly intelligent behavior without them.

Rationality for one

From the earliest beginnings of ancient Greek philosophy, the concept of intelligence has been tied to the ability to perceive, to reason, and to act successfully.11 Over the centuries, the concept has become both broader in its applicability and more precise in its definition.

Aristotle, among others, studied the notion of successful reasoning—methods of logical deduction that would lead to true conclusions given true premises. He also studied the process of deciding how to act—sometimes called practical reasoning—and proposed that it involved deducing that a certain course of action would achieve a desired goal:

We deliberate not about ends, but about means. For a doctor does not deliberate whether he shall heal, nor an orator whether he shall persuade. . . . They assume the end and consider how and by what means it is attained, and if it seems easily and best produced thereby; while if it is achieved by one means only they consider how it will be achieved by this and by what means this will be achieved, till they come to the first cause . . . and what is last in the order of analysis seems to be first in the order of becoming. And if we come on an impossibility, we give up the search, e.g., if we need money and this cannot be got; but if a thing appears possible we try to do it.12

This passage, one might argue, set the tone for the next two-thousand-odd years of Western thought about rationality. It says that the “end”—what the person wants—is fixed and given; and it says that the rational action is one that, according to logical deduction across a sequence of actions, “easily and best” produces the end.

Aristotle’s proposal seems reasonable, but it isn’t a complete guide to rational behavior. In particular, it omits the issue of uncertainty. In the real world, reality has a tendency to intervene, and few actions or sequences of actions are truly guaranteed to achieve the intended end. For example, it is a rainy Sunday in Paris as I write this sentence, and on Tuesday at 2:15 p.m. my flight to Rome leaves from Charles de Gaulle Airport, about forty-five minutes from my house. I plan to leave for the airport around 11:30 a.m., which should give me plenty of time, but it probably means at least an hour sitting in the departure area. Am I certain to catch the flight? Not at all. There could be huge traffic jams, the taxi drivers may be on strike, the taxi I’m in may break down or the driver may be arrested after a high-speed chase, and so on. Instead, I could leave for the airport on Monday, a whole day in advance. This would greatly reduce the chance of missing the flight, but the prospect of a night in the departure lounge is not an appealing one. In other words, my plan involves a trade-off between the certainty of success and the cost of ensuring that degree of certainty. The following plan for buying a house involves a similar trade-off: buy a lottery ticket, win a million dollars, then buy the house. This plan “easily and best” produces the end, but it’s not very likely to succeed. The difference between this harebrained house-buying plan and my sober and sensible airport plan is, however, just a matter of degree. Both are gambles, but one seems more rational than the other.

It turns out that gambling played a central role in generalizing Aristotle’s proposal to account for uncertainty. In the 1560s, the Italian mathematician Gerolamo Cardano developed the first mathematically precise theory of probability—using dice games as his main example. (Unfortunately, his work was not published until 1663.13) In the seventeenth century, French thinkers including Antoine Arnauld and Blaise Pascal began—for assuredly mathematical reasons—to study the question of rational decisions in gambling.14 Consider the following two bets:

A: 20 percent chance of winning $10

B: 5 percent chance of winning $100

The proposal the mathematicians came up with is probably the same one you would come up with: compare the expected values of the bets, which means the average amount you would expect to get from each bet. For bet A, the expected value is 20 percent of $10, or $2. For bet B, the expected value is 5 percent of $100, or $5. So bet B is better, according to this theory. The theory makes sense, because if the same bets are offered over and over again, a bettor who follows the rule ends up with more money than one who doesn’t.

In the eighteenth century, the Swiss mathematician Daniel Bernoulli noticed that this rule didn’t seem to work well for larger amounts of money.15 For example, consider the following two bets:

A: 100 percent chance of getting $10,000,000

(expected value $10,000,000)

B: 1 percent chance of getting $1,000,000,100

(expected value $10,000,001)

Most readers of this book, as well as its author, would prefer bet A to bet B, even though the expected-value rule says the opposite! Bernoulli posited that bets are evaluated not according to expected monetary value but according to expected utility. Utility—the property of being useful or beneficial to a person—was, he suggested, an internal, subjective quantity related to, but distinct from, monetary value. In particular, utility exhibits diminishing returns with respect to money. This means that the utility of a given amount of money is not strictly proportional to the amount but grows more slowly. For example, the utility of having $1,000,000,100 is much less than a hundred times the utility of having $10,000,000. How much less? You can ask yourself! What would the odds of winning a billion dollars have to be for you to give up a guaranteed ten million? I asked this question of the graduate students in my class and their answer was around 50 percent, meaning that bet B would have an expected value of $500 million to match the desirability of bet A. Let me say that again: bet B would have an expected dollar value fifty times greater than bet A, but the two bets would have equal utility.
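
Bernoulli's own proposal was that utility grows logarithmically with wealth, and even that simple form reproduces the preference for bet A. A sketch (the $20,000 of existing wealth is an arbitrary illustrative figure):

```python
import math

def expected_utility(bet, utility):
    """bet: a list of (probability, payoff) pairs."""
    return sum(p * utility(x) for p, x in bet)

wealth = 20_000                                # illustrative current wealth
log_utility = lambda payoff: math.log(wealth + payoff)

bet_a = [(1.00, 10_000_000)]                   # a sure $10 million
bet_b = [(0.01, 1_000_000_100), (0.99, 0)]     # a 1% shot at roughly $1 billion

print(expected_utility(bet_a, log_utility))    # about 16.8
print(expected_utility(bet_b, log_utility))    # about 10.0: A wins despite B's higher expected value
```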

Bernoulli’s introduction of utility—an invisible property—to explain human behavior via a mathematical theory was an utterly remarkable proposal for its time. It was all the more remarkable for the fact that, unlike monetary amounts, the utility values of various bets and prizes are not directly observable; instead, utilities are to be inferred from the preferences exhibited by an individual. It would be two centuries before the implications of the idea were fully worked out and it became broadly accepted by statisticians and economists.

In the middle of the twentieth century, John von Neumann (a great mathematician after whom the standard “von Neumann architecture” for computers was named16) and Oskar Morgenstern published an axiomatic basis for utility theory.17 What this means is the following: as long as the preferences exhibited by an individual satisfy certain basic axioms that any rational agent should satisfy, then necessarily the choices made by that individual can be described as maximizing the expected value of a utility function. In short, a rational agent acts so as to maximize expected utility.
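
Written out, the conclusion takes the form familiar from decision-theory textbooks: a rational agent chooses the action

$$a^{*} \;=\; \operatorname*{argmax}_{a} \sum_{s} P(s \mid a)\, U(s),$$

where $P(s \mid a)$ is the probability of outcome $s$ if action $a$ is taken and $U(s)$ is the utility of that outcome.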

It’s hard to overstate the importance of this conclusion. In many ways, artificial intelligence has been mainly about working out the details of how to build rational machines.

让我们更详细地看一下理性实体应该满足的公理。这里有一条公理,称为传递性:如果你喜欢 A 胜过 B,喜欢 B 胜过 C,那么你也会喜欢 A 胜过 C。这似乎很合理!(如果你喜欢香肠披萨胜过普通披萨,喜欢普通披萨胜过菠萝披萨,那么预测你会选择香肠披萨而不是菠萝披萨似乎是合理的。)还有另一条公理,称为单调性:如果你更喜欢奖品 A 而不是奖品 B,并且你可以在以 A 和 B 为仅有的两种可能结果的彩票之间做选择,那么你会更喜欢获得 A 的概率最高的那张彩票。同样,这也很合理。

Let’s look in a bit more detail at the axioms that rational entities are expected to satisfy. Here’s one, called transitivity: if you prefer A to B and you prefer B to C, then you prefer A to C. This seems pretty reasonable! (If you prefer sausage pizza to plain pizza, and you prefer plain pizza to pineapple pizza, then it seems reasonable to predict that you will choose sausage pizza over pineapple pizza.) Here’s another, called monotonicity: if you prefer prize A to prize B, and you have a choice of lotteries where A and B are the only two possible outcomes, you prefer the lottery with the highest probability of getting A rather than B. Again, pretty reasonable.

偏好不仅仅与披萨和有奖金的彩票有关。偏好可以是任何事情;特别是,它可以与整个未来的生活和他人的生活有关。在处理涉及随时间推移的事件序列的偏好时,通常会做出一个额外的假设,称为平稳性:如果两个不同的未来 A 和 B 以同一事件开始,并且您更喜欢 A 而不是 B,则在事件发生后,您仍然更喜欢 A 而不是 B。这听起来很合理,但它有一个令人惊讶的强烈后果:任何事件序列的效用都是与每个事件相关的奖励的总和(可能随着时间的推移通过某种心理利率折现)。18虽然这种“效用作为奖励的总和”假设很普遍——至少可以追溯到功利主义创始人杰里米·边沁的 18 世纪“享乐主义计算”——但它所基于的平稳性假设并不是理性主体的必要属性。平稳性还排除了一个人的偏好可能随时间而改变的可能性,而我们的经验表明并非如此。

Preferences are not just about pizza and lotteries with monetary prizes. They can be about anything at all; in particular, they can be about entire future lives and the lives of others. When dealing with preferences involving sequences of events over time, there is an additional assumption that is often made, called stationarity: if two different futures A and B begin with the same event, and you prefer A to B, you still prefer A to B after the event has occurred. This sounds reasonable, but it has a surprisingly strong consequence: the utility of any sequence of events is the sum of rewards associated with each event (possibly discounted over time, by a sort of mental interest rate).18 Although this “utility as a sum of rewards” assumption is widespread—going back at least to the eighteenth-century “hedonic calculus” of Jeremy Bentham, the founder of utilitarianism—the stationarity assumption on which it is based is not a necessary property of rational agents. Stationarity also rules out the possibility that one’s preferences might change over time, whereas our experience indicates otherwise.

尽管这些公理很合理,由此得出的结论也很重要,但自从效用理论广为人知以来,它就一直受到持续不断的反对。有些人鄙视它,认为它把一切都归结为金钱和自私。(一些法国作家嘲讽该理论是“美国式的”,19尽管它的根源就在法国。)事实上,想要过一种自我克制的生活、只希望减少他人的痛苦,是完全理性的。利他主义只是意味着,在评估任何给定的未来时,把他人的福祉放在重要位置。

Despite the reasonableness of the axioms and the importance of the conclusions that follow from them, utility theory has been subjected to a continual barrage of objections since it first became widely known. Some despise it for supposedly reducing everything to money and selfishness. (The theory was derided as “American” by some French authors,19 even though it has its roots in France.) In fact, it is perfectly rational to want to live a life of self-denial, wishing only to reduce the suffering of others. Altruism simply means placing substantial weight on the well-being of others in evaluating any given future.

另一组反对意见与以下困难有关:获取必要的概率值和效用值,并把它们相乘以计算预期效用。这些反对意见只是混淆了两件不同的事情:选择理性的行为,和通过计算预期效用来选择理性的行为。例如,如果有人试图用手指戳你的眼球,你的眼睑会闭上以保护眼睛;这是理性的,但其中不涉及任何预期效用的计算。或者假设你骑着一辆没有刹车的自行车下山,可以选择以每小时十英里的速度撞上一堵混凝土墙,或者以每小时二十英里的速度撞上山下更远处的另一堵混凝土墙;你会选择哪一个?如果你选择每小时十英里,恭喜你!你计算过预期效用吗?可能没有。但每小时十英里的选择仍然是理性的。这源于两个基本假设:第一,你更喜欢较轻的伤害而不是较重的伤害;第二,对于任何给定的伤害程度,提高碰撞速度都会增加超过该程度的概率。从这两个假设出发,在数学上就能得出——根本不需要考虑任何数字——以每小时十英里的速度撞车比以每小时二十英里的速度撞车具有更高的预期效用。20综上所述,最大化预期效用可能并不需要计算任何期望或任何效用。它只是对理性实体的一种纯粹外部的描述。

Another set of objections has to do with the difficulty of obtaining the necessary probabilities and utility values and multiplying them together to calculate expected utilities. These objections are simply confusing two different things: choosing the rational action and choosing it by calculating expected utilities. For example, if you try to poke your eyeball with your finger, your eyelid closes to protect your eye; this is rational, but no expected-utility calculations are involved. Or suppose you are riding a bicycle downhill with no brakes and have a choice between crashing into one concrete wall at ten miles per hour or another, farther down the hill, at twenty miles per hour; which would you prefer? If you chose ten miles per hour, congratulations! Did you calculate expected utilities? Probably not. But the choice of ten miles per hour is still rational. This follows from two basic assumptions: first, you prefer less severe injuries to more severe injuries, and second, for any given level of injuries, increasing the speed of collision increases the probability of exceeding that level. From these two assumptions it follows mathematically—without considering any numbers at all—that crashing at ten miles per hour has higher expected utility than crashing at twenty.20 In summary, maximizing expected utility may not require calculating any expectations or any utilities. It’s a purely external description of a rational entity.

对理性理论的另一个批评,在于对决策发生于何处——也就是说,什么东西才算作主体——的认定。人类是主体,这似乎是显而易见的,但家庭、部落、公司、文化和民族国家呢?如果我们研究蚂蚁等社会性昆虫,把一只蚂蚁视为一个智能主体是否有意义?还是说智能其实存在于整个蚁群之中——由多个蚂蚁大脑和身体组成的一种复合大脑,通过信息素信号而不是电信号相互连接?从进化的角度来看,这可能是一种更有成效的思考蚂蚁的方式,因为同一蚁群中的蚂蚁通常有密切的亲缘关系。作为个体,蚂蚁和其他社会性昆虫似乎缺乏区别于蚁群保护的自我保护本能:它们总是会投身于与入侵者的战斗,即使胜算渺茫到等于自杀。然而有时人类也会这样做,甚至是为了保护没有血缘关系的人类;就好像物种受益于一部分个体的存在,这些个体愿意在战斗中牺牲自己,或者踏上疯狂的、投机性的探索之旅,或者养育他人的后代。在这种情况下,完全聚焦于个体的理性分析显然遗漏了一些本质的东西。

Another critique of the theory of rationality lies in the identification of the locus of decision making. That is, what things count as agents? It might seem obvious that humans are agents, but what about families, tribes, corporations, cultures, and nation-states? If we examine social insects such as ants, does it make sense to consider a single ant as an intelligent agent, or does the intelligence really lie in the colony as a whole, with a kind of composite brain made up of multiple ant brains and bodies that are interconnected by pheromone signaling instead of electrical signaling? From an evolutionary point of view, this may be a more productive way of thinking about ants, since the ants in a given colony are typically closely related. As individuals, ants and other social insects seem to lack an instinct for self-preservation as distinct from colony preservation: they will always throw themselves into battle against invaders, even at suicidal odds. Yet sometimes humans will do the same even to defend unrelated humans; it is as if the species benefits from the presence of some fraction of individuals who are willing to sacrifice themselves in battle, or to go off on wild, speculative voyages of exploration, or to nurture the offspring of others. In such cases, an analysis of rationality that focuses entirely on the individual is clearly missing something essential.

对效用理论的其他主要反对意见是经验性的——也就是说,它们基于表明人类是非理性的实验证据:我们会以系统性的方式违背这些公理。21我在这里的目的并不是要为效用理论作为人类行为的正式模型辩护。事实上,人类不可能表现得完全理性。我们的偏好涵盖我们自己未来的整个人生、我们子孙后代的人生,以及现在或将来其他人的人生。然而,我们甚至无法在棋盘上走出正确的棋步,而棋盘只是一个规则明确、视野很短的微小而简单的地方。这并不是因为我们的偏好不理性,而是因为决策问题太复杂。我们认知结构的很大一部分,就是为了弥补我们小而慢的大脑与我们时刻面临的决策问题那难以理解的巨大复杂性之间的不匹配。

The other principal objections to utility theory are empirical—that is, they are based on experimental evidence suggesting that humans are irrational. We fail to conform to the axioms in systematic ways.21 It is not my purpose here to defend utility theory as a formal model of human behavior. Indeed, humans cannot possibly behave rationally. Our preferences extend over the whole of our own future lives, the lives of our children and grandchildren, and the lives of others, living now or in the future. Yet we cannot even play the right moves on the chessboard, a tiny, simple place with well-defined rules and a very short horizon. This is not because our preferences are irrational but because of the complexity of the decision problem. A great deal of our cognitive structure is there to compensate for the mismatch between our small, slow brains and the incomprehensibly huge complexity of the decision problem that we face all the time.

因此,虽然将有益的人工智能理论建立在人类理性的假设之上是相当不合理的,但假设成年人对未来生活的偏好大致一致却是相当合理的。也就是说,如果你能以某种方式观看两部电影,每部电影都足够详细和广泛地描述了你可能的未来生活,以至于每部电影都构成了一次虚拟体验,你可以说你喜欢哪一部,或者表示无所谓。22

So, while it would be quite unreasonable to base a theory of beneficial AI on an assumption that humans are rational, it’s quite reasonable to suppose that an adult human has roughly consistent preferences over future lives. That is, if you were somehow able to watch two movies, each describing in sufficient detail and breadth a future life you might lead, such that each constitutes a virtual experience, you could say which you prefer, or express indifference.22

如果我们的唯一目标是确保足够智能的机器不会给人类带来灾难,那么这种说法也许比必要的更强。灾难这个概念本身就意味着一种绝对不被偏好的生活。因此,为了避免灾难,我们只需要声称:当灾难性的未来被详细描绘出来时,成年人能够识别它。当然,人类偏好的结构远比“非灾难好于灾难”更细致,而且大概也是可以确定的。

This claim is perhaps stronger than necessary, if our only goal is to make sure that sufficiently intelligent machines are not catastrophic for the human race. The very notion of catastrophe entails a definitely-not-preferred life. For catastrophe avoidance, then, we need claim only that adult humans can recognize a catastrophic future when it is spelled out in great detail. Of course, human preferences have a much more fine-grained and, presumably, ascertainable structure than just “non-catastrophes are better than catastrophes.”

事实上,有益的人工智能理论可以适应人类偏好的不一致,但你的偏好中不一致的部分永远无法得到满足,人工智能也无能为力。例如,假设你对披萨的偏好违反了传递性公理:

A theory of beneficial AI can, in fact, accommodate inconsistency in human preferences, but the inconsistent part of your preferences can never be satisfied and there’s nothing AI can do to help. Suppose, for example, that your preferences for pizza violate the axiom of transitivity:

机器人:欢迎回家!想吃菠萝披萨吗?

你:不,你应该知道比起菠萝披萨我更喜欢普通披萨。

机器人:好的,一份普通的披萨上来!

你:不用了,我更喜欢香肠披萨。

机器人:非常抱歉,一份香肠披萨!

你:事实上,比起香肠我更喜欢菠萝。

机器人:是我错了,那就菠萝!

你:我已经说过,比起菠萝披萨我更喜欢普通披萨。

ROBOT: Welcome home! Want some pineapple pizza?

YOU: No, you should know I prefer plain pizza to pineapple.

ROBOT: OK, one plain pizza coming up!

YOU: No thanks, I like sausage pizza better.

ROBOT: So sorry, one sausage pizza!

YOU: Actually, I prefer pineapple to sausage.

ROBOT: My mistake, pineapple it is!

YOU: I already said I like plain better than pineapple.

机器人提供的任何披萨都无法让你满意,因为总有另一种披萨是你更想要的。机器人只能满足你偏好中一致的部分——比如说,假设比起完全没有披萨,你更喜欢这三种披萨中的任何一种。在这种情况下,一个乐于助人的机器人可以给你三种披萨中的任何一种,从而满足你避免“没有披萨”的偏好,同时让你有闲暇去琢磨你那令人恼火的不一致的披萨配料偏好。

There is no pizza the robot can serve that will make you happy because there’s always another pizza you would prefer to have. A robot can satisfy only the consistent part of your preferences—for example, let’s say you prefer all three kinds of pizza to no pizza at all. In that case, a helpful robot could give you any one of the three pizzas, thereby satisfying your preference to avoid “no pizza” while leaving you to contemplate your annoyingly inconsistent pizza topping preferences at leisure.
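
To make the impossibility concrete, here is a small Python sketch (my illustration, not anything from the book) that encodes the dialogue's cyclic preferences and shows that every pizza the robot could serve has a strictly preferred alternative.

```python
# Strict pairwise preferences from the dialogue: plain > pineapple,
# sausage > plain, pineapple > sausage. Together they form a cycle.
prefers = {("plain", "pineapple"), ("sausage", "plain"), ("pineapple", "sausage")}
options = ["plain", "sausage", "pineapple"]

def better_alternative(served):
    """Return an option strictly preferred to the one served, if any."""
    return next((o for o in options if (o, served) in prefers), None)

for pizza in options:
    print(pizza, "->", better_alternative(pizza))
# Every option prints a preferred alternative, so no single pizza
# can satisfy these intransitive preferences.
```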

两人理性

Rationality for two

理性主体的行为是为了最大化预期效用——这一基本思想非常简单,即使实际做到这一点复杂得不可能实现。然而,这一理论只适用于单个主体单独行动的情况。一旦有多个主体,“至少在原则上可以为自己行为的不同结果分配概率”这一概念就会变得很成问题。原因是,现在世界的一部分——另一个主体——正试图猜测你将要采取什么行动,反之亦然,因此如何为世界的这一部分将如何表现分配概率并不明显。而没有概率,理性行为的定义——最大化预期效用——就不适用了。

The basic idea that a rational agent acts so as to maximize expected utility is simple enough, even if actually doing it is impossibly complex. The theory applies, however, only in the case of a single agent acting alone. With more than one agent, the notion that it’s possible—at least in principle—to assign probabilities to the different outcomes of one’s actions becomes problematic. The reason is that now there’s a part of the world—the other agent—that is trying to second-guess what action you’re going to do, and vice versa, so it’s not obvious how to assign probabilities to how that part of the world is going to behave. And without probabilities, the definition of rational action as maximizing expected utility isn’t applicable.

一旦有其他人出现,代理就需要用别的方式来做出理性决策。这就是博弈论的用武之地。尽管名字叫博弈论,但它并不一定与通常意义上的游戏有关;它是将理性概念推广到多个代理情形的一种普遍尝试。这对于我们的目的显然很重要,因为我们(目前)还没有计划制造生活在其他恒星系统中无人居住行星上的机器人;我们要把机器人放进我们自己所居住的这个世界里。

As soon as someone else comes along, then, an agent will need some other way to make rational decisions. This is where game theory comes in. Despite its name, game theory isn’t necessarily about games in the usual sense; it’s a general attempt to extend the notion of rationality to situations with multiple agents. This is obviously important for our purposes, because we aren’t planning (yet) to build robots that live on uninhabited planets in other star systems; we’re going to put the robots in our world, which is inhabited by us.

为了说明为什么我们需要博弈论,让我们看一个简单的例子:爱丽丝和鲍勃在后花园踢足球(图 3)。爱丽丝即将罚点球,鲍勃在守门。爱丽丝将向鲍勃的左边或右边射门。因为爱丽丝是右脚球员,所以向鲍勃的右边射门会更容易一些,也更准确一些。因为爱丽丝的射门很凶猛,鲍勃知道他必须立刻向一侧或另一侧扑出——他没有时间等着看球会往哪个方向飞。鲍勃可以这样推理:“如果爱丽丝向我的右边射门,她得分的几率更大,因为她是右脚球员,所以她会选择那样,所以我会向右边扑救。”但爱丽丝并不傻,她可以想象鲍勃会这样想,在这种情况下她会向鲍勃的左边射门。但鲍勃也不傻,他可以想象爱丽丝会这样想,在这种情况下他会向他的左边扑救。但爱丽丝并不傻,她可以想象鲍勃会这样想……好吧,你明白了。换句话说:如果爱丽丝有一个理性的选择,鲍勃也能弄清楚它、预测它,并阻止爱丽丝得分,所以这个选择一开始就不可能是理性的。

To make it clear why we need game theory, let’s look at a simple example: Alice and Bob playing soccer in the back garden (figure 3). Alice is about to take a penalty kick and Bob is in goal. Alice is going to shoot to Bob’s left or to his right. Because she is right-footed, it’s a little bit easier and more accurate for Alice to shoot to Bob’s right. Because Alice has a ferocious shot, Bob knows he has to dive one way or the other right away—he won’t have time to wait and see which way the ball is going. Bob could reason like this: “Alice has a better chance of scoring if she shoots to my right, because she’s right-footed, so she’ll choose that, so I’ll dive right.” But Alice is no fool and can imagine Bob thinking this way, in which case she will shoot to Bob’s left. But Bob is no fool and can imagine Alice thinking this way, in which case he will dive to his left. But Alice is no fool and can imagine Bob thinking this way. . . . OK, you get the idea. Put another way: if there is a rational choice for Alice, Bob can figure it out too, anticipate it, and stop Alice from scoring, so the choice couldn’t have been rational in the first place.

图 3:爱丽丝即将对鲍勃罚点球。

FIGURE 3: Alice about to take a penalty kick against Bob.

早在 1713 年——同样是在对赌博游戏的分析中——人们就找到了这个难题的解决方案。23诀窍不在于选择任何一种行动,而是选择一个随机策略。例如,爱丽丝可以选择“以 55% 的概率向鲍勃的右侧射击,以 45% 的概率向鲍勃的左侧射击”的策略。鲍勃可以选择“以 60% 的概率向右俯冲,以 40% 的概率向左俯冲”的策略。在行动之前,每个人都在心里抛出一枚偏差适当的硬币,这样就不会泄露自己的意图。通过采取不可预测的行动,爱丽丝和鲍勃避免了上一段的矛盾。即使鲍勃知道爱丽丝的随机策略是什么,如果没有水晶球,他也无能为力。

As early as 1713—once again, in the analysis of gambling games—a solution was found to this conundrum.23 The trick is not to choose any one action but to choose a randomized strategy. For example, Alice can choose the strategy “shoot to Bob’s right with probability 55 percent and shoot to his left with probability 45 percent.” Bob could choose “dive right with probability 60 percent and left with probability 40 percent.” Each mentally tosses a suitably biased coin just before acting, so they don’t give away their intentions. By acting unpredictably, Alice and Bob avoid the contradictions of the preceding paragraph. Even if Bob works out what Alice’s randomized strategy is, there’s not much he can do about it without a crystal ball.

下一个问题是,这些概率应该是多少?爱丽丝选择 55%–45% 是否理性?具体数值取决于爱丽丝向鲍勃右侧射门时准确度高出多少、鲍勃向正确方向扑救时的扑救能力有多强,等等。(完整分析见注释。24)然而,一般标准非常简单:

The next question is, What should the probabilities be? Is Alice’s choice of 55 percent–45 percent rational? The specific values depend on how much more accurate Alice is when shooting to Bob’s right, how good Bob is at saving the shot when he dives the right way, and so on. (See the notes for the complete analysis.24) The general criterion is very simple, however:

  1. 假设鲍勃的策略是固定的,那么爱丽丝的策略就是她能想出的最佳策略。

  1. Alice’s strategy is the best she can devise, assuming that Bob’s is fixed.

  2. 假设爱丽丝的策略是固定的,那么鲍勃的策略就是他能想出的最佳策略。

  2. Bob’s strategy is the best he can devise, assuming that Alice’s is fixed.

如果两个条件都满足,我们就说这些策略是均衡的。这种均衡被称为纳什均衡,以纪念约翰·纳什。1950 年,22 岁的纳什证明了,无论游戏规则如何,对于任何数量的具有任何理性偏好的主体来说,这种均衡都存在。在与精神分裂症斗争了几十年之后,纳什最终康复,并于 1994 年因这项工作获得了诺贝尔经济学奖。

If both conditions are satisfied, we say that the strategies are in equilibrium. This kind of equilibrium is called a Nash equilibrium in honor of John Nash, who, in 1950 at the age of twenty-two, proved that such an equilibrium exists for any number of agents with any rational preferences and no matter what the rules of the game might be. After several decades’ struggle with schizophrenia, Nash eventually recovered and was awarded the Nobel Memorial Prize in Economics for this work in 1994.
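
The equilibrium probabilities can be computed from the scoring chances by an indifference condition: each player randomizes so that the opponent's two options become equally good, which is exactly when both conditions above hold at once. The sketch below uses made-up scoring probabilities, chosen so the answers land near the 55/45 and 60/40 figures in the text.

```python
# Hypothetical probabilities that Alice scores, indexed by
# (her shot side, Bob's dive side). The numbers are invented.
a = 0.60   # shoot right, Bob dives right
b = 0.95   # shoot right, Bob dives left
c = 0.90   # shoot left,  Bob dives right
d = 0.50   # shoot left,  Bob dives left

# Alice mixes so Bob's two dives give the same scoring chance,
# and Bob mixes so Alice's two shots do; solving the two linear
# indifference equations gives:
p_shoot_right = (d - c) / (a - b - c + d)   # ~0.53
q_dive_right = (d - b) / (a - b - c + d)    # ~0.60

print(f"Alice shoots right with probability {p_shoot_right:.2f}")
print(f"Bob dives right with probability {q_dive_right:.2f}")
```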

对于爱丽丝和鲍勃的足球比赛来说,只有一个均衡。在其他情况下,可能会有多个均衡,因此,与预期效用决策不同,纳什均衡的概念并不总是能给出关于如何行动的唯一建议。

For Alice and Bob’s soccer game, there is only one equilibrium. In other cases, there may be several, so the concept of Nash equilibria, unlike that of expected-utility decisions, does not always lead to a unique recommendation for how to behave.

更糟糕的是,有些情况下纳什均衡似乎会导致非常不理想的结果。其中一个例子就是著名的囚徒困境,由纳什的博士导师阿尔伯特·塔克于 1950 年命名。25这个博弈是现实世界中那些非常常见的情况的抽象模型,在这种情况下,相互合作对所有相关人员来说都是更好的选择,但人们仍然选择相互毁灭。

Worse still, there are situations in which the Nash equilibrium seems to lead to highly undesirable outcomes. One such case is the famous prisoner’s dilemma, so named by Nash’s PhD adviser, Albert Tucker, in 1950.25 The game is an abstract model of those all-too-common real-world situations where mutual cooperation would be better for all concerned but people nonetheless choose mutual destruction.

囚徒困境的运作方式如下:爱丽丝和鲍勃是一桩罪案的嫌疑人,正在被分开审讯。每个人都有一个选择:向警方坦白并出卖自己的同伙,或者拒不开口。26如果两人都拒绝,他们将以较轻的罪名被定罪并服刑两年;如果两人都坦白,他们将以较重的罪名被定罪并服刑十年;如果一人坦白而另一人拒绝,坦白的人将被释放,而同伙将服刑二十年。

The prisoner’s dilemma works as follows: Alice and Bob are suspects in a crime and are being interrogated separately. Each has a choice: to confess to the police and rat on his or her accomplice, or to refuse to talk.26 If both refuse, they are convicted on a lesser charge and serve two years; if both confess, they are convicted on a more serious charge and serve ten years; if one confesses and the other refuses, the one who confesses goes free and the accomplice serves twenty years.

现在,爱丽丝这样推理:“如果鲍勃要坦白,那我也应该坦白(十年总比二十年好);如果他要拒绝,那我还是应该坦白(获得自由总比坐两年牢好);所以不管怎样,我都应该坦白。”鲍勃也以同样的方式推理。因此,他们最终都坦白了自己的罪行并服刑十年,尽管如果共同拒绝,他们本可以只服刑两年。问题在于,共同拒绝不是纳什均衡,因为每个人都有通过坦白而背叛并获得自由的动机。

Now, Alice reasons as follows: “If Bob is going to confess, then I should confess too (ten years is better than twenty); if he is going to refuse, then I should confess (going free is better than spending two years in prison); so either way, I should confess.” Bob reasons the same way. Thus, they both end up confessing to their crimes and serving ten years, even though by jointly refusing they could have served only two years. The problem is that joint refusal isn’t a Nash equilibrium, because each has an incentive to defect and go free by confessing.
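
A short sketch can verify this reasoning mechanically. Using the sentences from the text as payoffs, the code below checks every pair of choices and confirms that mutual confession is the only Nash equilibrium; the helper `is_nash` is just illustrative.

```python
from itertools import product

# Years served (Alice, Bob) for each pair of choices, from the text.
years = {("confess", "confess"): (10, 10),
         ("confess", "refuse"):  (0, 20),
         ("refuse",  "confess"): (20, 0),
         ("refuse",  "refuse"):  (2, 2)}
actions = ["confess", "refuse"]

def is_nash(a, b):
    """Neither player can reduce their own sentence by deviating alone."""
    alice_ok = all(years[(a, b)][0] <= years[(alt, b)][0] for alt in actions)
    bob_ok = all(years[(a, b)][1] <= years[(a, alt)][1] for alt in actions)
    return alice_ok and bob_ok

for a, b in product(actions, actions):
    print(a, b, "<- equilibrium" if is_nash(a, b) else "")
# Only (confess, confess) survives, even though (refuse, refuse)
# would be better for both.
```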

请注意,爱丽丝可以这样推理:“无论我做什么推理,鲍勃也会这样做。所以我们最终会选择同样的事情。既然共同拒绝比共同坦白更好,我们就应该拒绝。”这种推理形式承认,作为理性的主体,爱丽丝和鲍勃会做出相关而非独立的选择。这只是博弈论者为获得不那么令人沮丧的囚徒困境解决方案而尝试的众多方法之一。27

Note that Alice could have reasoned as follows: “Whatever reasoning I do, Bob will also do. So we’ll end up choosing the same thing. Since joint refusal is better than joint confession, we should refuse.” This form of reasoning acknowledges that, as rational agents, Alice and Bob will make choices that are correlated rather than independent. It’s just one of many approaches that game theorists have tried in their efforts to obtain less depressing solutions to the prisoner’s dilemma.27

另一个著名的不良均衡例子是公地悲剧,1833 年英国经济学家威廉·劳埃德28首次对其进行了分析,但直到 1968 年生态学家加勒特·哈丁为其命名,它才引起全球关注。29当几个人都可以消费一种补充速度缓慢的共享资源(如公共牧场或鱼类资源)时,悲剧就会发生。在没有任何社会或法律约束的情况下,自私(非利他)主体之间唯一的纳什均衡就是每个人都尽可能多地消费,从而导致资源迅速崩溃。理想的解决方案——每个人共享资源,使总消费可持续——并不是一个均衡,因为每个人都有作弊的动机,拿走超过自己应得份额的资源,并把成本转嫁给他人。当然,在实践中,人类有时会通过建立配额和惩罚或定价方案等机制来避免这种悲剧。他们之所以能做到这一点,是因为他们并不局限于决定消费多少;他们还可以决定进行沟通。通过以这种方式扩大决策问题,我们找到了对每个人都更好的解决方案。

Another famous example of an undesirable equilibrium is the tragedy of the commons, first analyzed in 1833 by the English economist William Lloyd28 but named, and brought to global attention, by the ecologist Garrett Hardin in 1968.29 The tragedy arises when several people can consume a shared resource—such as common grazing land or fish stocks—that replenishes itself slowly. Absent any social or legal constraints, the only Nash equilibrium among selfish (non-altruistic) agents is for each to consume as much as possible, leading to rapid collapse of the resource. The ideal solution, where everyone shares the resource such that the total consumption is sustainable, is not an equilibrium because each individual has an incentive to cheat and take more than their fair share—imposing the costs on others. In practice, of course, humans do sometimes avoid this tragedy by setting up mechanisms such as quotas and punishments or pricing schemes. They can do this because they are not limited to deciding how much to consume; they can also decide to communicate. By enlarging the decision problem in this way, we find solutions that are better for everyone.

这些例子和许多其他例子都表明,将理性决策理论扩展到多个主体会产生许多有趣而复杂的行为。这也非常重要,因为显而易见,人类不止一个,而且很快还会有智能机器。毋庸置疑,我们必须实现相互合作、造福人类,而不是相互毁灭。

These examples, and many others, illustrate the fact that extending the theory of rational decisions to multiple agents produces many interesting and complex behaviors. It’s also extremely important because, as should be obvious, there is more than one human being. And soon there will be intelligent machines too. Needless to say, we have to achieve mutual cooperation, resulting in benefit to humans, rather than mutual destruction.

计算机

Computers

对智能有一个合理的定义,是创造智能机器的第一个要素。第二个要素是一台能够实现这一定义的机器。出于很快就会明白的原因,这台机器就是计算机。它本可以是别的东西——例如,我们可能试图通过复杂的化学反应或劫持生物细胞30来制造智能机器——但从最早的机械计算器开始,为计算而制造的设备在其发明者看来一直是智能的天然家园。

Having a reasonable definition of intelligence is the first ingredient in creating intelligent machines. The second ingredient is a machine in which that definition can be realized. For reasons that will soon become obvious, that machine is a computer. It could have been something different—for example, we might have tried to make intelligent machines out of complex chemical reactions or by hijacking biological cells30—but devices built for computation, from the very earliest mechanical calculators onwards, have always seemed to their inventors to be the natural home for intelligence.

我们已经习惯了电脑,几乎注意不到它们那令人难以置信的强大功能。如果你有一台笔记本电脑、台式电脑或智能手机,看看它:一个小盒子,可以输入字符。只需打字,你就可以创建程序,让这个盒子变成新的东西,也许是神奇地合成远洋船撞上冰山或外星球上有高个子蓝色人的运动图像;再打几下,它就会把英语翻译成中文;再打几下,它就会听和说;再打几下,它就会打败世界象棋冠军。

We are so used to computers now that we barely notice their utterly incredible powers. If you have a laptop or a desktop or a smart phone, look at it: a small box, with a way to type characters. Just by typing, you can create programs that turn the box into something new, perhaps something that magically synthesizes moving images of oceangoing ships hitting icebergs or alien planets with tall blue people; type some more, and it translates English into Chinese; type some more, and it listens and speaks; type some more, and it defeats the world chess champion.

这种单个盒子能够执行任何你能想象到的过程的能力被称为通用性,这是艾伦·图灵在 1936 年首次提出的概念。31通用性意味着我们不需要为算术、机器翻译、国际象棋、语音理解或动画分别制造机器:一台机器就能完成所有工作。你的笔记本电脑与世界上最大的 IT 公司运营的庞大服务器群本质上是相同的——即使是那些配备了用于机器学习的花哨的专用张量处理单元的服务器群。它与所有尚未发明的未来计算设备也基本相同。只要有足够的内存,笔记本电脑就能执行完全相同的任务;只是需要长得多的时间。

This ability of a single box to carry out any process that you can imagine is called universality, a concept first introduced by Alan Turing in 1936.31 Universality means that we do not need separate machines for arithmetic, machine translation, chess, speech understanding, or animation: one machine does it all. Your laptop is essentially identical to the vast server farms run by the world’s largest IT companies—even those equipped with fancy, special-purpose tensor processing units for machine learning. It’s also essentially identical to all future computing devices yet to be invented. The laptop can do exactly the same tasks, provided it has enough memory; it just takes a lot longer.

图灵介绍通用性的论文是有史以来最重要的论文之一。在论文中,他描述了一种简单的计算设备,它可以接受任何其他计算设备的描述以及第二个设备的输入作为输入,并通过模拟第二个设备对其输入的操作,产生与第二个设备相同的输出。我们现在将第一个设备称为通用图灵机。为了证明其通用性,图灵为两种新的数学对象引入了精确的定义:机器和程序。机器和程序共同定义了一系列事件 - 具体而言,是机器及其内存中的一系列状态变化。

Turing’s paper introducing universality was one of the most important ever written. In it, he described a simple computing device that could accept as input the description of any other computing device, together with that second device’s input, and, by simulating the operation of the second device on its input, produce the same output that the second device would have produced. We now call this first device a universal Turing machine. To prove its universality, Turing introduced precise definitions for two new kinds of mathematical objects: machines and programs. Together, the machine and program define a sequence of events—specifically, a sequence of state changes in the machine and its memory.

在数学史上,很少出现新类型的对象。数学始于有记载的历史之初的数字。然后,大约公元前 2000 年,古埃及人和巴比伦人开始研究几何对象(点、线、角、面积等)。中国数学家在公元前一千年引入了矩阵,而集合作为数学对象直到十九世纪才出现。图灵的新对象——机器和程序——可能是有史以来最强大的数学对象。具有讽刺意味的是,数学界在很大程度上未能认识到这一点,从 20 世纪 40 年代开始,计算机和计算一直是大多数主要大学工程系的专长。

In the history of mathematics, new kinds of objects occur quite rarely. Mathematics began with numbers at the dawn of recorded history. Then, around 2000 BCE, ancient Egyptians and Babylonians worked with geometric objects (points, lines, angles, areas, and so on). Chinese mathematicians introduced matrices during the first millennium BCE, while sets as mathematical objects arrived only in the nineteenth century. Turing’s new objects—machines and programs—are perhaps the most powerful mathematical objects ever invented. It is ironic that the field of mathematics largely failed to recognize this, and from the 1940s onwards, computers and computation have been the province of engineering departments in most major universities.

随之而来的领域——计算机科学——在接下来的 70 年里蓬勃发展,产生了大量新概念、设计、方法和应用,以及全球八大最有价值的公司中的七家。

The field that emerged—computer science—exploded over the next seventy years, producing a vast array of new concepts, designs, methods, and applications, as well as seven of the eight most valuable companies in the world.

计算机科学的核心概念是算法,即计算某事物的精确指定方法。算法如今已成为日常生活中熟悉的一部分:袖珍计算器中的平方根算法接收一个数字作为输入,并返回该数字的平方根作为输出;下棋算法接收一个棋局并返回一步棋;路线查找算法接收起点位置、目标位置和街道地图,并返回从起点到目标的最快路线。算法可以用英语或数学符号来描述,但要实现它们,必须用编程语言将其编码为程序。可以使用更简单的算法作为构建块(称为子程序)来构建更复杂的算法——例如,自动驾驶汽车可能会使用路线查找算法作为子程序,以便知道要去哪里。通过这种方式,可以一层一层地建立起极其复杂的软件系统。

The central concept in computer science is that of an algorithm, which is a precisely specified method for computing something. Algorithms are, by now, familiar parts of everyday life: a square-root algorithm in a pocket calculator receives a number as input and returns the square root of that number as output; a chess-playing algorithm takes a chess position and returns a move; a route-finding algorithm takes a start location, a goal location, and a street map and returns the fastest route from start to goal. Algorithms can be described in English or in mathematical notation, but to be implemented they must be coded as programs in a programming language. More complex algorithms can be built by using simpler ones as building blocks called subroutines—for example, a self-driving car might use a route-finding algorithm as a subroutine so that it knows where to go. In this way, software systems of immense complexity are built up, layer by layer.
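
As a toy illustration of subroutines (my example, not the book's), here is a route-finding algorithm used as a building block by a higher-level "driving" procedure; the three-place street map is invented for the demonstration.

```python
from collections import deque

def find_route(street_map, start, goal):
    """Breadth-first search: returns a shortest route as a list of places."""
    frontier = deque([[start]])
    visited = {start}
    while frontier:
        route = frontier.popleft()
        if route[-1] == goal:
            return route
        for neighbor in street_map[route[-1]]:
            if neighbor not in visited:
                visited.add(neighbor)
                frontier.append(route + [neighbor])
    return None

def drive_to(street_map, start, goal):
    """A (toy) higher-level task using route finding as a subroutine."""
    for waypoint in find_route(street_map, start, goal):
        print("driving to", waypoint)

street_map = {"home": ["junction"], "junction": ["home", "airport"],
              "airport": ["junction"]}
drive_to(street_map, "home", "airport")
```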

计算机硬件很重要,因为速度更快、内存更大的计算机可以让算法运行得更快、处理更多信息。这一领域的进步众所周知,但仍然令人难以置信。第一台商用电子可编程计算机 Ferranti Mark I 每秒可以执行大约一千(10³)条指令,主存约为一千字节。截至 2019 年初,最快的计算机是田纳西州橡树岭国家实验室的 Summit 机,每秒执行大约 10¹⁸ 条指令(快一千万亿倍),内存为 2.5 × 10¹⁷ 字节(多 250 万亿倍)。这一进步源于电子设备乃至底层物理学的进步,使计算机能够实现令人难以置信的微型化程度。

Computer hardware matters because faster computers with more memory allow algorithms to run more quickly and to handle more information. Progress in this area is well known but still mind-boggling. The first commercial electronic programmable computer, the Ferranti Mark I, could execute about a thousand (10³) instructions per second and had about a thousand bytes of main memory. The fastest computer as of early 2019, the Summit machine at the Oak Ridge National Laboratory in Tennessee, executes about 10¹⁸ instructions per second (a thousand trillion times faster) and has 2.5 × 10¹⁷ bytes of memory (250 trillion times more). This progress has resulted from advances in electronic devices and even in the underlying physics, allowing an incredible degree of miniaturization.

虽然将计算机与大脑进行比较意义不大,但 Summit 的数据略高于人脑的原始容量。如前所述,人脑有大约 10¹⁵ 个突触,“循环时间”约为百分之一秒,理论上每秒最多可进行约 10¹⁷ 次“操作”。两者最大的区别在于功耗:Summit 的功耗大约是人脑的一百万倍。

Although comparisons between computers and brains are not especially meaningful, the numbers for Summit slightly exceed the raw capacity of the human brain, which, as noted previously, has about 10¹⁵ synapses and a “cycle time” of about one hundredth of a second, for a theoretical maximum of about 10¹⁷ “operations” per second. The biggest difference is power consumption: Summit uses about a million times more power.

摩尔定律是一项经验观察,即芯片上的电子元件数量每两年翻一番,预计该定律将继续有效到 2025 年左右,尽管速度会略慢一些。多年来,速度一直受到硅晶体管快速开关所产生的大量热量的限制;此外,电路尺寸也无法变得更小,因为导线和连接器(截至 2019 年)已经只有不超过 25 个原子宽、5 到 10 个原子厚。2025 年以后,我们将需要利用更多奇特的物理现象——包括负电容器件、32单原子晶体管、石墨烯纳米管和光子学——来让摩尔定律(或其后继定律)延续下去。

Moore’s law, an empirical observation that the number of electronic components on a chip doubles every two years, is expected to continue until 2025 or so, although at a slightly slower rate. For some years, speeds have been limited by the large amount of heat generated by the fast switching of silicon transistors; moreover, circuit sizes cannot get much smaller because the wires and connectors are (as of 2019) no more than twenty-five atoms wide and five to ten atoms thick. Beyond 2025, we will need to use more exotic physical phenomena—including negative capacitance devices,32 single-atom transistors, graphene nanotubes, and photonics—to keep Moore’s law (or its successor) going.

除了加速通用计算机,另一种可能性是构建专用设备,这些设备经过定制,仅用于执行某一类计算。例如,谷歌的张量处理单元(TPU)旨在执行某些机器学习算法所需的计算。一个 TPU pod(2018 版)每秒可执行大约 10¹⁷ 次计算——几乎与 Summit 机器一样多——但功耗约低一百倍,体积也小一百倍。即使底层芯片技术大致保持不变,这类机器也可以造得越来越大,从而为 AI 系统提供大量原始计算能力。

Instead of just speeding up general-purpose computers, another possibility is to build special-purpose devices that are customized to perform just one class of computations. For example, Google’s tensor processing units (TPUs) are designed to perform the calculations required for certain machine learning algorithms. One TPU pod (2018 version) performs roughly 10¹⁷ calculations per second—nearly as much as the Summit machine—but uses about one hundred times less power and is one hundred times smaller. Even if the underlying chip technology remains roughly constant, these kinds of machines can simply be made larger and larger to provide vast quantities of raw computational power for AI systems.

量子计算则完全是另一回事。它利用量子力学波函数的奇特性质来实现非凡的成果:使用两倍数量的量子硬件,你可以完成超过两倍的计算量!它的工作原理大致如下:33假设你有一个存储量子比特(qubit)的微型物理设备。量子比特有两种可能的状态,0 和 1。在经典物理学中,量子比特设备必须处于两种状态之一;而在量子物理学中,携带该量子比特信息的波函数表示它同时处于两种状态。如果你有两个量子比特,则有四种可能的联合状态:00、01、10 和 11。如果波函数在两个量子比特之间相干纠缠——这意味着没有其他物理过程来搞乱它——那么这两个量子比特就同时处于所有四种状态。此外,如果将这两个量子比特连接到执行某种计算的量子电路中,那么该计算会在所有四种状态上同时进行。如果使用三个量子比特,则可以同时处理八种状态,依此类推。当然,存在一些物理限制,使得实际完成的工作量小于量子比特数量的指数,34但我们知道,对于一些重要问题,量子计算可以被证明比任何传统计算机都更高效。

Quantum computation is a different kettle of fish. It uses the strange properties of quantum-mechanical wave functions to achieve something remarkable: with twice the amount of quantum hardware, you can do more than twice the amount of computation! Very roughly, it works like this:33 Suppose you have a tiny physical device that stores a quantum bit, or qubit. A qubit has two possible states, 0 and 1. Whereas in classical physics the qubit device has to be in one of the two states, in quantum physics the wave function that carries information about the qubit says that it is in both states simultaneously. If you have two qubits, there are four possible joint states: 00, 01, 10, and 11. If the wave function is coherently entangled across the two qubits, meaning that no other physical processes are there to mess it up, then the two qubits are in all four states simultaneously. Moreover, if the two qubits are connected into a quantum circuit that performs some calculation, then the calculation proceeds with all four states simultaneously. With three qubits, you get eight states processed simultaneously, and so on. Now, there are some physical limitations so that the amount of work that gets done is less than exponential in the number of qubits,34 but we know that there are important problems for which quantum computation is provably more efficient than any classical computer.
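
A trivial enumeration (ordinary classical code, not a quantum simulation) shows how the number of joint basis states doubles with each added qubit:

```python
from itertools import product

for n in (1, 2, 3):
    states = ["".join(bits) for bits in product("01", repeat=n)]
    print(n, "qubit(s):", len(states), "joint states:", states)
# 1 qubit: 2 states; 2 qubits: 4 (00, 01, 10, 11); 3 qubits: 8.
# An entangled n-qubit register carries amplitudes for all 2**n at once.
```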

截至 2019 年,已有数十个量子比特的小型量子处理器的实验原型投入运行,但还不存在任何量子处理器比传统计算机更快的有趣计算任务。主要的困难是退相干——热噪声等过程会破坏多量子比特波函数的相干性。量子科学家希望通过引入纠错电路来解决退相干问题,使计算中出现的任何错误都能通过某种投票过程被快速检测和纠正。不幸的是,纠错系统需要多得多的量子比特来完成同样的工作:虽然与现有的传统计算机相比,具有几百个完美量子比特的量子机器将非常强大,但我们可能需要几百万个纠错量子比特才能真正实现这些计算。从几十个量子比特发展到几百万个量子比特还需要相当多年。如果我们最终实现了这一目标,那将彻底改变我们仅靠蛮力计算所能做的事情的图景。35我们也许不必等待人工智能的真正概念性进步,而是可以利用量子计算的原始力量来绕过当前“非智能”算法所面临的一些障碍。

As of 2019, there are experimental prototypes of small quantum processors in operation with a few tens of qubits, but there are no interesting computing tasks for which a quantum processor is faster than a classical computer. The main difficulty is decoherence—processes such as thermal noise that mess up the coherence of the multi-qubit wave function. Quantum scientists hope to solve the decoherence problem by introducing error correction circuitry, so that any error that occurs in the computation is quickly detected and corrected by a kind of voting process. Unfortunately, error-correcting systems require far more qubits to do the same work: while a quantum machine with a few hundred perfect qubits would be very powerful compared to existing classical computers, we will probably need a few million error-correcting qubits to actually realize those computations. Going from a few tens to a few million qubits will take quite a few years. If, eventually, we get there, that would completely change the picture of what we can do by sheer brute-force computation.35 Rather than waiting for real conceptual advances in AI, we might be able to use the raw power of quantum computation to bypass some of the barriers faced by current “unintelligent” algorithms.

计算的极限

The limits of computation

早在 20 世纪 50 年代,大众媒体就将计算机描述为“比爱因斯坦还快”的“超级大脑”。那么,我们现在终于可以说计算机和人脑一样强大了吗?不能。只关注原始的计算能力完全偏离了重点。单看速度不会给我们带来人工智能。在更快的计算机上运行设计不良的算法不会使算法变得更好;这只会意味着你会更快地得到错误的答案。(数据越多,错误答案的机会就越多!)更快的机器的主要作用是缩短实验时间,以便研究可以更快地取得进展。阻碍人工智能发展的不是硬件,而是软件。我们还不知道如何让机器真正智能化——即使它有宇宙那么大。

Even in the 1950s, computers were described in the popular press as “super-brains” that were “faster than Einstein.” So can we say now, finally, that computers are as powerful as the human brain? No. Focusing on raw computing power misses the point entirely. Speed alone won’t give us AI. Running a poorly designed algorithm on a faster computer doesn’t make the algorithm better; it just means you get the wrong answer more quickly. (And with more data there are more opportunities for wrong answers!) The principal effect of faster machines has been to make the time for experimentation shorter, so that research can progress more quickly. It’s not hardware that is holding AI back; it’s software. We don’t yet know how to make a machine really intelligent—even if it were the size of the universe.

但是,假设我们确实设法开发出了正确的人工智能软件。物理学对计算机的强大程度有任何限制吗?这些限制会阻止我们拥有足够的计算能力来创建真正的 AI 吗?答案似乎是:是的,存在限制;不,这些限制丝毫不可能阻止我们创建真正的 AI。麻省理工学院物理学家 Seth Lloyd 根据量子理论和熵的考虑,估算了笔记本电脑大小的计算机的极限。36这些数字甚至会让卡尔·萨根扬起眉毛:每秒 10⁵¹ 次操作和 10³⁰ 字节内存,也就是比 Summit 大约快十亿万亿万亿倍,内存多四万亿倍——如前所述,Summit 的原始能力已超过人脑。因此,当有人提出人类思维代表了我们宇宙中物理上可实现的上限时,37人们至少应该要求进一步澄清。

Suppose, however, that we do manage to develop the right kind of AI software. Are there any limits placed by physics on how powerful a computer can be? Will those limits prevent us from having enough computing power to create real AI? The answers seem to be yes, there are limits, and no, there isn’t a ghost of a chance that the limits will prevent us from creating real AI. MIT physicist Seth Lloyd has estimated the limits for a laptop-sized computer, based on considerations from quantum theory and entropy.36 The numbers would raise even Carl Sagan’s eyebrows: 10⁵¹ operations per second and 10³⁰ bytes of memory, or approximately a billion trillion trillion times faster and four trillion times more memory than Summit—which, as noted previously, has more raw power than the human brain. Thus, when one hears suggestions that the human mind represents an upper limit on what is physically achievable in our universe,37 one should at least ask for further clarification.

除了物理学的限制之外,计算机能力还受到计算机科学家工作的其他限制。图灵本人证明,有些问题无法由任何计算机解决:问题定义明确,有答案,但不可能存在总能找到答案的算法。他举了后来被称为停机问题的例子:算法能否确定给定程序是否存在阻止其完成的“无限循环”?38

Besides limits imposed by physics, there are other limits on the abilities of computers that originate in the work of computer scientists. Turing himself proved that some problems are undecidable by any computer: the problem is well defined, there is an answer, but there cannot exist an algorithm that always finds that answer. He gave the example of what became known as the halting problem: Can an algorithm decide if a given program has an “infinite loop” that prevents it from ever finishing?38

图灵证明没有算法可以解决停机问题39,这对于数学的基础非常重要,但似乎这与计算机是否具有智能无关。这种说法的一个原因是,同样的基本限制似乎也适用于人脑。一旦你开始要求人脑进行精确的自我模拟,模拟自我,模拟自我,等等,你肯定会遇到困难。就我而言,我从未担心过自己无法做到这一点。

Turing’s proof that no algorithm can solve the halting problem39 is incredibly important for the foundations of mathematics, but it seems to have no bearing on the issue of whether computers can be intelligent. One reason for this claim is that the same basic limitation seems to apply to the human brain. Once you start asking a human brain to perform an exact simulation of itself simulating itself simulating itself, and so on, you’re bound to run into difficulties. I, for one, have never worried about my inability to do this.

因此,专注于可判定问题似乎不会对人工智能造成任何实际限制。然而,事实证明,可判定并不意味着容易。计算机科学家花费大量时间思考问题的复杂性,即用最有效的方法解决问题需要多少计算量的问题。这是一个简单的问题:给定一千个数字的列表,找出最大的数字。如果检查每个数字需要一秒钟,那么通过依次检查每个数字并跟踪最大值的明显方法解决这个问题需要一千秒钟。有没有更快的方法?没有,因为如果一种方法没有检查列表中的某个数字,那么这个数字可能是最大的,这种方法就会失败。所以,找到最大元素的时间与列表的大小成正比。计算机科学家会说这个问题具有线性复杂性,这意味着它非常简单;然后她会寻找更有趣的东西来研究。

Focusing on decidable problems, then, seems not to place any real restrictions on AI. It turns out, however, that decidable doesn’t mean easy. Computer scientists spend a lot of time thinking about the complexity of problems, that is, the question of how much computation is needed to solve a problem by the most efficient method. Here’s an easy problem: given a list of a thousand numbers, find the biggest number. If it takes one second to check each number, then it takes a thousand seconds to solve this problem by the obvious method of checking each in turn and keeping track of the biggest. Is there a faster method? No, because if a method didn’t check some number in the list, that number might be the biggest, and the method would fail. So, the time to find the largest element is proportional to the size of the list. A computer scientist would say the problem has linear complexity, meaning that it’s very easy; then she would look for something more interesting to work on.
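
The "obvious method" from the paragraph, written out as a few lines of Python:

```python
def biggest(numbers):
    """Check each number in turn, keeping track of the biggest so far."""
    best = numbers[0]
    for x in numbers[1:]:   # one look at each element: linear complexity
        if x > best:
            best = x
    return best

print(biggest([7, 3, 9, 1]))  # 9
```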

让理论计算机科学家感到兴奋的是,许多问题在最坏情况下似乎40具有指数级的复杂性。这意味着两件事:首先,我们所知的所有算法都需要指数级时间(即随输入规模呈指数增长的时间量)来解决至少某些问题实例;其次,理论计算机科学家相当肯定不存在更高效的算法。

What gets theoretical computer scientists excited is the fact that many problems appear40 to have exponential complexity in the worst case. This means two things: first, all the algorithms we know about require exponential time—that is, an amount of time exponential in the size of the input—to solve at least some problem instances; second, theoretical computer scientists are pretty sure that more efficient algorithms do not exist.

难度呈指数增长意味着问题在理论上可能是可解的(也就是说,它们肯定是可判定的),但在实践中有时却无法解决;我们称此类问题为难解问题。一个例子是判定给定地图能否仅用三种颜色着色,使得任何两个相邻区域都不具有相同颜色。(众所周知,用四种不同的颜色着色总是可行的。)对于一百万个区域,可能有些实例(不是全部,但有一些)需要大约 2¹⁰⁰⁰ 个计算步骤才能找到答案,这意味着在 Summit 超级计算机上需要大约 10²⁷⁵ 年,而在 Seth Lloyd 的终极物理笔记本电脑上也“只”需要 10²⁴² 年。宇宙的年龄约为 10¹⁰ 年,与此相比微不足道。

Exponential growth in difficulty means that problems may be solvable in theory (that is, they are certainly decidable) but sometimes unsolvable in practice; we call such problems intractable. An example is the problem of deciding whether a given map can be colored with just three colors, so that no two adjacent regions have the same color. (It is well known that coloring with four different colors is always possible.) With a million regions, it may be that there are some cases (not all, but some) that require something like 2¹⁰⁰⁰ computational steps to find the answer, which means about 10²⁷⁵ years on the Summit supercomputer or a mere 10²⁴² years on Seth Lloyd’s ultimate-physics laptop. The age of the universe, about 10¹⁰ years, is a tiny blip compared to this.
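
The arithmetic behind those figures can be checked in a few lines; the 10¹⁸ and 10⁵¹ operations-per-second rates are the round numbers quoted earlier, and the year length is an approximation.

```python
import math

log10_steps = 1000 * math.log10(2)   # 2**1000 steps is about 10**301
log10_seconds_per_year = math.log10(3.15e7)

def log10_years(log10_ops_per_second):
    """Order of magnitude of the running time, in years."""
    return log10_steps - log10_ops_per_second - log10_seconds_per_year

print(int(log10_years(18)))  # Summit, ~10^18 ops/s  -> about 10^275 years
print(int(log10_years(51)))  # ultimate laptop, ~10^51 -> about 10^242 years
```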

棘手问题的存在是否给了我们任何理由认为计算机不可能像人类一样聪明?不。也没有理由认为人类可以解决棘手问题。量子计算有一点帮助(无论是在机器还是大脑中),但不足以改变基本结论。

Does the existence of intractable problems give us any reason to think that computers cannot be as intelligent as humans? No. There is no reason to suppose that humans can solve intractable problems either. Quantum computation helps a bit (whether in machines or brains), but not enough to change the basic conclusion.

复杂性意味着现实世界的决策问题——决定生命中每个时刻现在做什么的问题——非常困难,以至于人类和计算机都无法找到完美的解决方案。

Complexity means that the real-world decision problem—the problem of deciding what to do right now, at every instant in one’s life—is so difficult that neither humans nor computers will ever come close to finding perfect solutions.

这会带来两个后果:首先,我们预计,大多数时候,现实世界的决策充其量只能算是还算过得去,而且肯定远非最优;其次,我们预计,人类和计算机的大部分思维结构(即其决策过程的实际运作方式)将被设计为尽可能克服复杂性,也就是说,尽管世界极其复杂,但仍有可能找到还算过得去的答案。最后,我们预计,无论未来的机器多么智能和强大,前两个后果仍将存在。机器可能比我们更有能力,但它仍远非完全理性。

This has two consequences: first, we expect that, most of the time, real-world decisions will be at best halfway decent and certainly far from optimal; second, we expect that a great deal of the mental architecture of humans and computers—the way their decision processes actually operate—will be designed to overcome complexity to the extent possible—that is, to make it possible to find even halfway decent answers despite the overwhelming complexity of the world. Finally, we expect that the first two consequences will remain true no matter how intelligent and powerful some future machine may be. The machine may be far more capable than us, but it will still be far from perfectly rational.

智能计算机

Intelligent Computers

亚里士多德等人对逻辑学的发展,为理性思维提供了精确的规则,但我们不知道亚里士多德是否曾考虑过用机器来实现这些规则的可能性。在十三世纪,颇具影响力的加泰罗尼亚哲学家、诱惑者和神秘主义者拉蒙·卢尔(Ramon Llull)离这一点近得多:他确实制作了刻有符号的纸轮,借助这些纸轮他可以生成断言的逻辑组合。伟大的十七世纪法国数学家布莱斯·帕斯卡(Blaise Pascal)是第一个开发出真正实用的机械计算器的人。尽管它只能做加减运算,并且主要用于他父亲的税务办公室,但它促使帕斯卡写道:“算术机器产生的效果,比动物的所有行为都更接近思维。”

The development of logic by Aristotle and others made available precise rules for rational thought, but we do not know whether Aristotle ever contemplated the possibility of machines that implemented these rules. In the thirteenth century, the influential Catalan philosopher, seducer, and mystic Ramon Llull came much closer: he actually made paper wheels inscribed with symbols, by means of which he could generate logical combinations of assertions. The great seventeenth-century French mathematician Blaise Pascal was the first to develop a real and practical mechanical calculator. Although it could only add and subtract and was used mainly in his father’s tax-collecting office, it led Pascal to write, “The arithmetical machine produces effects which appear nearer to thought than all the actions of animals.”

19 世纪,英国数学家兼发明家查尔斯·巴贝奇设计了分析机——一台符合图灵后来所定义意义上的可编程通用机器——技术由此实现了巨大飞跃。在这项工作中,他得到了洛夫莱斯伯爵夫人艾达的帮助,她是浪漫主义诗人兼冒险家拜伦勋爵的女儿。巴贝奇希望用分析机来计算精确的数学和天文表格,而洛夫莱斯则理解了它的真正潜力,41并在 1842 年将其描述为“一台思考的或……推理的机器”,可以对“宇宙中的所有事物”进行推理。如此看来,创建人工智能的基本概念要素已经齐备!从那时起,人工智能的出现肯定只是时间问题了……

Technology took a dramatic leap forward in the nineteenth century when the British mathematician and inventor Charles Babbage designed the Analytical Engine, a programmable universal machine in the sense defined later by Turing. He was helped in his work by Ada, Countess of Lovelace, daughter of the romantic poet and adventurer Lord Byron. Whereas Babbage hoped to use the Analytical Engine to compute accurate mathematical and astronomical tables, Lovelace understood its true potential,41 describing it in 1842 as “a thinking or . . . a reasoning machine” that could reason about “all subjects in the universe.” So, the basic conceptual elements for creating AI were in place! From that point, surely, AI would be just a matter of time. . . .

不幸的是,这一等就是很长时间——分析机从未被制造出来,洛夫莱斯的想法也大多被人遗忘了。随着图灵 1936 年的理论工作和随后第二次世界大战的推动,通用计算机终于在 20 世纪 40 年代成为现实。关于创造智能的想法随即而来。图灵 1950 年的论文《计算机器与智能》42是许多关于智能机器可能性的早期著作中最著名的一篇。怀疑论者当时已经在断言,机器永远无法做到 X——X 几乎可以是任何你能想到的事情——而图灵驳斥了这些断言。他还提出了一种智能的操作性测试,称为模仿游戏,后来(以简化形式)被称为图灵测试。该测试衡量机器的行为——具体来说,就是它欺骗人类审讯者、使其相信它是人类的能力。

A long time, unfortunately—the Analytical Engine was never built, and Lovelace’s ideas were largely forgotten. With Turing’s theoretical work in 1936 and the subsequent impetus of World War II, universal computing machines were finally realized in the 1940s. Thoughts about creating intelligence followed immediately. Turing’s 1950 paper, “Computing Machinery and Intelligence,”42 is the best known of many early works on the possibility of intelligent machines. Skeptics were already asserting that machines would never be able to do X, for almost any X you could think of, and Turing refuted those assertions. He also proposed an operational test for intelligence, called the imitation game, which subsequently (in simplified form) became known as the Turing test. The test measures the behavior of the machine—specifically, its ability to fool a human interrogator into thinking that it is human.

模仿游戏在图灵的论文中扮演着一个特殊的角色——即作为一个思想实验,用来回应那些认为机器不能以正确的方式、出于正确的理由、带着正确的意识来思考的怀疑论者。图灵希望把争论引向机器能否以某种方式行事的问题;如果机器能够——比如说,能够明智地讨论莎士比亚的十四行诗及其含义——那么对人工智能的怀疑就无法真正站住脚了。与常见的解读相反,我不认为这个测试旨在作为智能的真正定义,即当且仅当机器通过图灵测试时它才是智能的。事实上,图灵写道:“机器难道不能做一些应该被描述为思考、但与人所做的非常不同的事情吗?”不将该测试视为人工智能定义的另一个原因是,它是一个糟糕的、难以着手的定义。正因为如此,主流人工智能研究人员几乎没有花费任何精力去通过图灵测试。

The imitation game serves a specific role in Turing’s paper—namely as a thought experiment to deflect skeptics who supposed that machines could not think in the right way, for the right reasons, with the right kind of awareness. Turing hoped to redirect the argument towards the issue of whether a machine could behave in a certain way; and if it did—if it was able, say, to discourse sensibly on Shakespeare’s sonnets and their meanings—then skepticism about AI could not really be sustained. Contrary to common interpretations, I doubt that the test was intended as a true definition of intelligence, in the sense that a machine is intelligent if and only if it passes the Turing test. Indeed, Turing wrote, “May not machines carry out something which ought to be described as thinking but which is very different from what a man does?” Another reason not to view the test as a definition for AI is that it’s a terrible definition to work with. And for that reason, mainstream AI researchers have expended almost no effort to pass the Turing test.

图灵测试对人工智能没有用处,因为它是一个非正式且高度偶然的定义:它取决于人类思维那极其复杂且大部分未知的特征,而这些特征既来自生物学,也来自文化。没有办法“拆解”这个定义并从它反推回去,创造出可证明能通过测试的机器。相反,人工智能一直专注于前面描述过的理性行为:在给定机器所感知到的东西的情况下,它的所作所为越有可能实现它想要的东西,它就越智能。

The Turing test is not useful for AI because it’s an informal and highly contingent definition: it depends on the enormously complicated and largely unknown characteristics of the human mind, which derive from both biology and culture. There is no way to “unpack” the definition and work back from it to create machines that will provably pass the test. Instead, AI has focused on rational behavior, just as described previously: a machine is intelligent to the extent that what it does is likely to achieve what it wants, given what it has perceived.

最初,像亚里士多德一样,人工智能研究人员把“它想要什么”等同于一个要么被满足、要么不被满足的目标。这些目标可能存在于玩具世界中,比如 15 拼图,其目标是在一个小小的(模拟)方形托盘中把所有编号的方块按从 1 到 15 的顺序排好;它们也可能存在于真实的物理环境中:20 世纪 70 年代初,加州 SRI 的 Shakey 机器人把大块积木推成所需的排列,而爱丁堡大学的 Freddy 则用零件组装一艘木船。所有这些工作都是使用逻辑问题求解器和规划系统来构建并执行保证能实现目标的计划。43

Initially, like Aristotle, AI researchers identified “what it wants” with a goal that is either satisfied or not. These goals could be in toy worlds like the 15-puzzle, where the goal is to get all the numbered tiles lined up in order from 1 to 15 in a little (simulated) square tray; or they might be in real, physical environments: in the early 1970s, the Shakey robot at SRI in California was pushing large blocks into desired configurations, and Freddy at the University of Edinburgh was assembling a wooden boat from its component pieces. All this work was done using logical problem-solvers and planning systems to construct and execute guaranteed plans to achieve goals.43

到了 20 世纪 80 年代,人们清楚地认识到,单靠逻辑推理是不够的,因为如前所述,没有任何计划可以保证让你到达机场。逻辑需要确定性,而现实世界根本无法提供确定性。与此同时,以色列裔美国计算机科学家 Judea Pearl(2011 年图灵奖得主)一直在研究基于概率论的不确定推理方法。44人工智能研究人员逐渐接受了 Pearl 的想法;他们采用了概率论和效用理论的工具,从而将人工智能与统计学、控制论、经济学和运筹学等其他领域联系起来。这一变化标志着一些观察家所说的现代人工智能的开始。

By the 1980s, it was clear that logical reasoning alone could not suffice, because, as noted previously, there is no plan that is guaranteed to get you to the airport. Logic requires certainty, and the real world simply doesn’t provide it. Meanwhile, the Israeli-American computer scientist Judea Pearl, who went on to win the 2011 Turing Award, had been working on methods for uncertain reasoning based in probability theory.44 AI researchers gradually accepted Pearl’s ideas; they adopted the tools of probability theory and utility theory and thereby connected AI to other fields such as statistics, control theory, economics, and operations research. This change marked the beginning of what some observers call modern AI.

代理和环境

Agents and environments

现代人工智能的核心概念是智能代理——能够感知和行动的东西。代理是一个随时间发生的过程,即感知输入流被转换成动作流。例如,假设所讨论的代理是一辆载我去机场的自动驾驶出租车。它的输入可能包括以每秒 30 帧的速度运行的 8 个 RGB 摄像头;每帧可能包含 750 万像素,每个像素在三个颜色通道中都有一个图像强度值,总计每秒超过 5 GB。(视网膜中 2 亿个感光细胞的数据流甚至更大,这部分解释了为什么视觉占据了人脑的很大一部分。)出租车还每秒从加速度计获取 100 次数据,以及 GPS 数据。数十亿个晶体管(或神经元)的庞大计算能力将大量原始数据转化为平稳、称职的驾驶行为。出租车的操作包括每秒 20 次向方向盘、刹车和油门发送的电子信号。(对于经验丰富的人类驾驶员来说,这种混乱的活动大部分是无意识的:你可能只意识到做出诸如“超越这辆慢速卡车”或“停下来加油”等决定,但你的眼睛、大脑、神经和肌肉仍在做其他所有事情。)对于国际象棋程序来说,输入主要是时钟滴答声,偶尔通知对手的举动和新的棋盘状态,而动作主要是在程序思考时什么也不做,偶尔选择一步并通知对手。对于个人数字助理或 PDA,例如 Siri 或 Cortana,输入不仅包括来自麦克风的声音信号(每秒采样 48,000 次)和来自触摸屏的输入,还包括它访问的每个网页的内容,而动作包括说话和在屏幕上显示材料。

The central concept of modern AI is the intelligent agent—something that perceives and acts. The agent is a process occurring over time, in the sense that a stream of perceptual inputs is converted into a stream of actions. For example, suppose the agent in question is a self-driving taxi taking me to the airport. Its inputs might include eight RGB cameras operating at thirty frames per second; each frame consists of perhaps 7.5 million pixels, each with an image intensity value in each of three color channels, for a total of more than five gigabytes per second. (The flow of data from the two hundred million photoreceptors in the retina is even larger, which partially explains why vision occupies such a large fraction of the human brain.) The taxi also gets data from an accelerometer one hundred times per second, as well as GPS data. This incredible flood of raw data is transformed by the simply gargantuan computing power of billions of transistors (or neurons) into smooth, competent driving behavior. The taxi’s actions include the electronic signals sent to the steering wheel, brakes, and accelerator, twenty times per second. (For an experienced human driver, most of this maelstrom of activity is unconscious: you may be aware only of making decisions such as “overtake this slow truck” or “stop for gas,” but your eyes, brain, nerves, and muscles are still doing all the other stuff.) For a chess program, the inputs are mostly just the clock ticks, with the occasional notification of the opponent’s move and the new board state, while the actions are mostly doing nothing while the program is thinking, and occasionally choosing a move and notifying the opponent. For a personal digital assistant, or PDA, such as Siri or Cortana, the inputs include not just the acoustic signal from the microphone (sampled forty-eight thousand times per second) and input from the touch screen but also the content of each Web page that it accesses, while the actions include both speaking and displaying material on the screen.
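
The five-gigabytes-per-second figure follows directly from the camera numbers, assuming one byte per color channel per pixel (an assumption consistent with the total given):

```python
cameras = 8
frames_per_second = 30
pixels_per_frame = 7.5e6
bytes_per_pixel = 3               # one byte per color channel (assumed)

bytes_per_second = cameras * frames_per_second * pixels_per_frame * bytes_per_pixel
print(bytes_per_second / 1e9)     # 5.4 -> "more than five gigabytes per second"
```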

我们构建智能代理的方式取决于我们面临的问题的性质。而这又取决于三件事:第一,代理所处环境的性质——棋盘与拥挤的高速公路或手机截然不同;第二,将代理与环境联系起来的观察和行动——例如,Siri 可能会或可能不会访问手机的摄像头,以便它能够看到;第三,代理的目标——教会对手下更好的棋与赢得比赛是截然不同的任务。

The way we build intelligent agents depends on the nature of the problem we face. This, in turn, depends on three things: first, the nature of the environment the agent will operate in—a chessboard is a very different place from a crowded freeway or a mobile phone; second, the observations and actions that connect the agent to the environment—for example, Siri might or might not have access to the phone’s camera so that it can see; and third, the agent’s objective—teaching the opponent to play better chess is a very different task from winning the game.

举一个例子来说明代理的设计如何取决于这些因素:如果目标是赢得比赛,那么国际象棋程序只需要考虑当前的棋盘状态,而不需要任何过去事件的记忆。45另一方面,国际象棋导师应该不断更新其模型,了解学生对国际象棋的哪些方面了解或不了解,以便提供有用的建议。换句话说,对于国际象棋导师来说,学生的思想是环境的相关部分。而且,与棋盘不同,它是环境中不可直接观察的一部分。

To give just one example of how the design of the agent depends on these things: If the objective is to win the game, a chess program need consider only the current board state and does not need any memory of past events.45 The chess tutor, on the other hand, should continually update its model of which aspects of chess the pupil does or does not understand so that it can provide useful advice. In other words, for the chess tutor, the pupil’s mind is a relevant part of the environment. Moreover, unlike the board, it is a part of the environment that is not directly observable.

影响代理设计的问题特征至少包括以下几点:46

The characteristics of problems that influence the design of agents include at least the following:46

  • 环境是否是完全可观察的(例如在国际象棋中,输入可以直接访问环境当前状态的所有相关方面)或部分可观察的(例如在驾驶中,人的视野有限,车辆不透明,而其他驾驶员的意图难以捉摸);

  • whether the environment is fully observable (as in chess, where the inputs provide direct access to all the relevant aspects of the current state of the environment) or partially observable (as in driving, where one’s field of view is limited, vehicles are opaque, and other drivers’ intentions are mysterious);

  • 环境和动作是否是离散的(如在国际象棋中)还是有效连续的(如在驾驶中);

  • whether the environment and actions are discrete (as in chess) or effectively continuous (as in driving);

  • 环境中是否包含其他代理(如在国际象棋和驾驶中),或不包含(如在地图上寻找最短路线);

  • whether the environment contains other agents (as in chess and driving) or not (as in finding the shortest routes on a map);

  • 由环境的“规则”或“物理”所规定的动作的结果是否可预测(如在国际象棋中)或不可预测(如在交通和天气中),以及这些规则是否已知或未知;

  • whether the outcomes of actions, as specified by the “rules” or “physics” of the environment, are predictable (as in chess) or unpredictable (as in traffic and weather), and whether those rules are known or unknown;

  • 环境是否动态变化,以至于做出决策的时间受到严格限制(如在驾驶中)或不受严格限制(如在税收策略优化中);

  • whether the environment is dynamically changing, so that the time to make decisions is tightly constrained (as in driving) or not (as in tax strategy optimization);

  • 根据目标衡量决策质量的时间范围长度——这可能非常短(例如紧急制动),也可能是中等长度(例如国际象棋,一局游戏持续大约一百步),或者很长(例如开车送我去机场,如果出租车每秒要做出一百次决策,那么这可能需要数十万个决策周期)。

  • the length of the horizon over which decision quality is measured according to the objective—this may be very short (as in emergency braking), of intermediate duration (as in chess, where a game lasts up to about one hundred moves), or very long (as in driving me to the airport, which might take hundreds of thousands of decision cycles if the taxi is deciding one hundred times per second).

可以想象,这些特征导致了各种令人眼花缭乱的问题类型。只需将上面列出的选项相乘,就可以得到 192 种类型。我们可以为所有这些类型找到现实世界中的问题实例。有些类型通常在 AI 以外的领域进行研究——例如,设计一种保持平飞的自动驾驶仪,是一个短视野、连续、动态的问题,通常在控制理论领域进行研究。

As one can imagine, these characteristics give rise to a bewildering variety of problem types. Just multiplying the choices listed above gives 192 types. One can find real-world problem instances for all the types. Some types are typically studied in areas outside AI—for example, designing an autopilot that maintains level flight is a short-horizon, continuous, dynamic problem that is usually studied in the field of control theory.
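
One way to recover the count of 192 (my reading of the list, treating the predictability bullet as two separate binary choices, which is how the multiplication works out):

```python
# One option count per characteristic in the list above.
choices = {
    "observability": 2,    # fully vs partially observable
    "discreteness": 2,     # discrete vs continuous
    "other agents": 2,     # present vs absent
    "predictability": 2,   # predictable vs unpredictable outcomes
    "rules known": 2,      # known vs unknown rules
    "dynamism": 2,         # tightly time-constrained vs not
    "horizon": 3,          # short, intermediate, long
}
total = 1
for n in choices.values():
    total *= n
print(total)  # 192
```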

显然,有些问题类型比其他问题类型更容易解决。人工智能在棋盘游戏和谜题等可观察、离散、确定性且规则已知的问题上取得了很大进展。对于较容易的问题类型,人工智能研究人员已经开发出相当通用和有效的算法,并具有扎实的理论理解;机器在这类问题上的表现往往超过人类。我们可以说一种算法是通用的,因为我们有数学证明,证明它在一类问题上以合理的计算复杂度给出最佳或接近最佳的结果,并且它在实践中很好地解决了这类问题,而不需要任何针对特定问题的修改。

Obviously some problem types are easier than others. AI has made a lot of progress on problems such as board games and puzzles that are observable, discrete, deterministic, and have known rules. For the easier problem types, AI researchers have developed fairly general and effective algorithms and a solid theoretical understanding; often, machines exceed human performance on these kinds of problems. We can tell that an algorithm is general because we have mathematical proofs that it gives optimal or near-optimal results with reasonable computational complexity across an entire class of problems, and because it works well in practice on those kinds of problems without needing any problem-specific modifications.

星际争霸等电子游戏比棋盘游戏难得多:它们涉及数百个活动部件和数千个步骤的时间范围,并且在任何给定时间点,棋盘都只有部分可见。在每个点,玩家可能有至少 10⁵⁰ 种走法可选,而围棋大约只有 10² 种。47另一方面,规则是已知的,世界是离散的,只有少数几种类型的物体。截至 2019 年初,机器已经和一些专业的星际争霸玩家一样优秀,但还没有准备好挑战最优秀的人类。48更重要的是,要达到这一点需要付出大量针对特定问题的努力;通用方法还没有为星际争霸做好准备。

Video games such as StarCraft are quite a bit harder than board games: they involve hundreds of moving parts and time horizons of thousands of steps, and the board is only partially visible at any given time. At each point, a player might have a choice of at least 10⁵⁰ moves, compared to about 10² in Go.47 On the other hand, the rules are known and the world is discrete with only a few types of objects. As of early 2019, machines are as good as some professional StarCraft players but not yet ready to challenge the very best humans.48 More important, it took a fair amount of problem-specific effort to reach that point; general-purpose methods are not quite ready for StarCraft.

政府管理或分子生物学教学等问题要困难得多。它们具有复杂且几乎不可观察的环境(整个国家的状态或学生的心理状态)、更多的对象和对象类型、对行为没有明确的定义、大多数未知的规则、大量的不确定性和非常长的时间尺度。我们有想法和现成的工具来分别应对这些特征中的每一个,但目前还没有能同时应对所有特征的通用方法。当我们为这些类型的任务构建人工智能系统时,它们往往需要大量针对特定问题的工程,并且通常非常脆弱。

Problems such as running a government or teaching molecular biology are much harder. They have complex, mostly unobservable environments (the state of a whole country, or the state of a student’s mind), far more objects and types of objects, no clear definition of what the actions are, mostly unknown rules, a great deal of uncertainty, and very long time scales. We have ideas and off-the-shelf tools that address each of these characteristics separately but, as yet, no general methods that cope with all the characteristics simultaneously. When we build AI systems for these kinds of tasks, they tend to require a great deal of problem-specific engineering and are often very brittle.

当我们设计出能够有效解决特定类型中较难问题的方法,或者设计出需要更少、更弱假设从而适用于更多问题的方法时,通用性就会取得进展。通用人工智能是一种适用于所有问题类型的方法,在只做出很少假设的情况下,能够有效地处理大型和困难的问题实例。这是人工智能研究的最终目标:一个不需要针对特定问题的工程、只需被要求去教授分子生物学课程或管理政府的系统。它会从所有可用资源中学习它需要学习的内容,在必要时提出问题,并开始制定和执行行之有效的计划。

Progress towards generality occurs when we devise methods that are effective for harder problems within a given type or methods that require fewer and weaker assumptions so they are applicable to more problems. General-purpose AI would be a method that is applicable across all problem types and works effectively for large and difficult instances while making very few assumptions. That’s the ultimate goal of AI research: a system that needs no problem-specific engineering and can simply be asked to teach a molecular biology class or run a government. It would learn what it needs to learn from all the available resources, ask questions when necessary, and begin formulating and executing plans that work.

这种通用方法目前尚不存在,但我们正在接近它。也许令人惊讶的是,通用人工智能的很多进展都来自与构建可怕的通用人工智能系统无关的研究。它来自对工具人工智能或狭义人工智能的研究,即为特定问题(如下围棋或识别手写数字)而设计的良好、安全、无趣的人工智能系统。人们通常认为,对这种人工智能的研究没有风险,因为它是针对特定问题的,与通用人工智能无关。

Such a general-purpose method does not yet exist, but we are moving closer. Perhaps surprisingly, a lot of this progress towards general AI results from research that isn’t about building scary, general-purpose AI systems. It comes from research on tool AI or narrow AI, meaning nice, safe, boring AI systems designed for particular problems such as playing Go or recognizing handwritten digits. Research on this kind of AI is often thought to present no risk because it’s problem-specific and nothing to do with general-purpose AI.

这种信念源于对这些系统所需要做的工作的误解。事实上,工具人工智能的研究可以而且经常会推动通用人工智能的发展,尤其是当研究者品味高雅,能够攻克当前通用方法无法解决的问题时。在这里,品味高雅意味着解决方案不仅仅是对某个聪明人在这样或那样的情况下会做什么的临时编码,而是试图让机器能够自己找出解决方案。

This belief results from a misunderstanding of what kind of work goes into these systems. In fact, research on tool AI can and often does produce progress towards general-purpose AI, particularly when it is done by researchers with good taste attacking problems that are beyond the capabilities of current general methods. Here, good taste means that the solution approach is not merely an ad hoc encoding of what an intelligent person would do in such-and-such situation but an attempt to provide the machine with the ability to figure out the solution for itself.

例如,当谷歌 DeepMind 的 AlphaGo 团队成功创建出世界一流的围棋程序时,他们并没有真正研究围棋。我的意思是,他们并没有编写大量围棋专用代码来规定在各种围棋局面下该怎么走。他们没有设计只适用于围棋的决策程序。相反,他们改进了两种相当通用的技术——用于决策的前瞻搜索和用于学习如何评估局面的强化学习——使它们足够有效,可以在超人水平上下围棋。这些改进适用于许多其他问题,包括机器人技术等相距甚远的领域。更让人信服的是,AlphaGo 的一个版本 AlphaZero 最近学会了在围棋上完胜 AlphaGo,还完胜了 Stockfish(世界上最好的国际象棋程序,远胜于任何人类)和 Elmo(世界上最好的将棋程序,也比任何人类都强)。AlphaZero 在一天之内完成了所有这些。49

For example, when the AlphaGo team at Google DeepMind succeeded in creating their world-beating Go program, they did this without really working on Go. What I mean by this is that they didn’t write a whole lot of Go-specific code saying what to do in different kinds of Go situations. They didn’t design decision procedures that work only for Go. Instead, they made improvements to two fairly general-purpose techniques—lookahead search to make decisions and reinforcement learning to learn how to evaluate positions—so that they were sufficiently effective to play Go at a superhuman level. Those improvements are applicable to many other problems, including problems as far afield as robotics. Just to rub it in, a version of AlphaGo called AlphaZero recently learned to trounce AlphaGo at Go, and also to trounce Stockfish (the world’s best chess program, far better than any human) and Elmo (the world’s best shogi program, also better than any human). AlphaZero did all this in one day.49

20 世纪 90 年代,通用人工智能在手写数字识别研究方面也取得了实质性进展。AT&T 实验室的 Yann LeCun 团队并没有编写特殊算法通过搜索曲线和环来识别“8”,而是改进了现有的神经网络学习算法,生成了卷积神经网络。这些网络在对标记示例进行适当训练后,表现出有效的字符识别能力。相同的算法可以学习识别字母、形状、停车标志、狗、猫和警车。在“深度学习”的名义下,它们彻底改变了语音识别和视觉对象识别。它们也是 AlphaZero 以及当前大多数自动驾驶汽车项目的关键组件之一。

There was also substantial progress towards general-purpose AI in research on recognizing handwritten digits in the 1990s. Yann LeCun’s team at AT&T Labs didn’t write special algorithms to recognize “8” by searching for curvy lines and loops; instead, they improved on existing neural network learning algorithms to produce convolutional neural networks. Those networks, in turn, exhibited effective character recognition after suitable training on labeled examples. The same algorithms can learn to recognize letters, shapes, stop signs, dogs, cats, and police cars. Under the headline of “deep learning,” they have revolutionized speech recognition and visual object recognition. They are also one of the key components in AlphaZero as well as in most of the current self-driving car projects.

如果你仔细想想,通用人工智能的进展将发生在解决特定任务的狭义人工智能项目中,这并不奇怪;这些任务让人工智能研究人员有东西可以深入钻研。(人们不说"盯着窗外看是发明之母"是有原因的。)同时,了解已经取得了多少进展以及界限在哪里也很重要。当 AlphaGo 击败李世石以及后来所有其他顶级围棋选手时,许多人认为,既然机器能从零开始学会在一项即使对高智商人类来说也公认极为困难的任务上击败人类,这就是末日的开端——人工智能接管世界只是时间问题。当 AlphaZero 在国际象棋、将棋和围棋上都获胜时,甚至一些怀疑论者可能也被说服了。但 AlphaZero 有严格的局限:它只适用于离散、可观察、规则已知的双人游戏这一类问题。这种方法根本不适用于驾驶、教学、管理政府或接管世界。

If you think about it, it’s hardly surprising that progress towards general AI is going to occur in narrow-AI projects that address specific tasks; those tasks give AI researchers something to get their teeth into. (There’s a reason people don’t say, “Staring out the window is the mother of invention.”) At the same time, it’s important to understand how much progress has occurred and where the boundaries are. When AlphaGo defeated Lee Sedol and later all the other top Go players, many people assumed that because a machine had learned from scratch to beat the human race at a task known to be very difficult even for highly intelligent humans, it was the beginning of the end—just a matter of time before AI took over. Even some skeptics may have been convinced when AlphaZero won at chess and shogi as well as Go. But AlphaZero has hard limitations: it works only in the class of discrete, observable, two-player games with known rules. The approach simply won’t work at all for driving, teaching, running a government, or taking over the world.

机器能力的这些严格界限意味着,当人们谈论"机器智商"迅速增长并威胁要超越人类智商时,他们是在胡说八道。智商的概念在应用于人类时之所以有意义,是因为人类的能力往往在广泛的认知活动中相互关联。试图为机器分配智商,就像试图让四足动物参加人类十项全能比赛一样。诚然,马可以跑得快、跳得高,但它们在撑杆跳和掷铁饼方面却困难重重。

These sharp boundaries on machine competence mean that when people talk about “machine IQ” increasing rapidly and threatening to exceed human IQ, they are talking nonsense. To the extent that the concept of IQ makes sense when applied to humans, it’s because human abilities tend to be correlated across a wide range of cognitive activities. Trying to assign an IQ to machines is like trying to get four-legged animals to compete in a human decathlon. True, horses can run fast and jump high, but they have a lot of trouble with pole-vaulting and throwing the discus.

目标和标准模型

Objectives and the standard model

从外部看智能代理,重要的是它从接收到的输入流中生成的动作流。从内部看,动作必须由代理程序来选择。可以说,人类生来就带有一个代理程序,该程序随着时间的推移学会在大量任务中相当成功地采取行动。到目前为止,人工智能的情况并非如此:我们不知道如何构建一个无所不能的通用人工智能程序,因此我们为不同类型的问题构建不同类型的代理程序。我需要至少稍微解释一下这些不同代理程序的工作原理;对于感兴趣的读者,本书末尾的附录提供了更详细的解释。(指向特定附录的指针以上标形式给出,例如这样A和这样D。)这里主要关注的是标准模型如何在这些不同类型的代理中实例化——换句话说,目标是如何被指定并传达给代理的。

Looking at an intelligent agent from the outside, what matters is the stream of actions it generates from the stream of inputs it receives. From the inside, the actions have to be chosen by an agent program. Humans are born with one agent program, so to speak, and that program learns over time to act reasonably successfully across a huge range of tasks. So far, that is not the case for AI: we don’t know how to build one general-purpose AI program that does everything, so instead we build different types of agent programs for different types of problems. I will need to explain at least a tiny bit about how these different agent programs work; more detailed explanations are given in the appendices at the end of the book for those who are interested. (Pointers to particular appendices are given as superscripts like thisA and this.D) The primary focus here is on how the standard model is instantiated in these various kinds of agents—in other words, how the objective is specified and communicated to the agent.

传达目标的最简单方法是给出一个目标状态(goal)。当您进入自动驾驶汽车并触摸屏幕上的"家"图标时,汽车会将此作为它的目标,并着手规划和执行路线。世界的一个状态要么满足目标(是的,我在家),要么不满足目标(不,我不住在旧金山机场)。在人工智能研究的经典时期,即 20 世纪 80 年代不确定性成为核心问题之前,大多数人工智能研究都假设世界是完全可观察且确定性的,此时用目标状态来指定目的是合理的。有时还有一个成本函数来评估解决方案,因此最佳解决方案是在达到目标的同时使总成本最小的方案。对于汽车来说,这可能是内置的——也许路线的成本是时间和燃料消耗的某种固定组合——或者人类可以选择指定两者之间的权衡。

The simplest way to communicate an objective is in the form of a goal. When you get into your self-driving car and touch the “home” icon on the screen, the car takes this as its objective and proceeds to plan and execute a route. A state of the world either satisfies the goal (yes, I’m at home) or it doesn’t (no, I don’t live at the San Francisco Airport). In the classical period of AI research, before uncertainty became a primary issue in the 1980s, most AI research assumed a world that was fully observable and deterministic, and goals made sense as a way to specify objectives. Sometimes there is also a cost function to evaluate solutions, so an optimal solution is one that minimizes total cost while reaching the goal. For the car, this might be built in—perhaps the cost of a route is some fixed combination of the time and fuel consumption—or the human might have the option of specifying the trade-off between the two.

实现这些目标的关键是能够"在心理上模拟"可能动作的效果,这有时称为前瞻搜索。你的自动驾驶汽车有内置地图,所以它知道从旧金山沿海湾大桥向东行驶可以到达奥克兰。起源于 20 世纪 60 年代的算法50通过前瞻和搜索许多可能的动作序列来找到最佳路线。A这些算法构成了现代基础设施的一个无处不在的组成部分:它们不仅提供行车路线,还提供航空出行方案、机器人装配、施工规划和交付物流。通过做一些修改来应对对手的无礼行为,同样的前瞻思想也适用于井字游戏、国际象棋和围棋等游戏,这些游戏的目标是根据各游戏对获胜的特定定义来取胜。

The key to achieving such objectives is the ability to “mentally simulate” the effects of possible actions, sometimes called lookahead search. Your self-driving car has an internal map, so it knows that driving east from San Francisco on the Bay Bridge gets you to Oakland. Algorithms originating in the 1960s50 find optimal routes by looking ahead and searching through many possible action sequences.A These algorithms form a ubiquitous part of modern infrastructure: they provide not just driving directions but also airline travel solutions, robotic assembly, construction planning, and delivery logistics. With some modifications to handle the impertinent behavior of opponents, the same idea of lookahead applies to games such as tic-tac-toe, chess, and Go, where the goal is to win according to the game’s particular definition of winning.
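
下面是前瞻搜索的一个极简示意(城市名称和距离均为虚构),以统一代价搜索(Dijkstra 风格)作为正文所述这类路线算法的一个经典实例。

Below is a minimal sketch of lookahead search (the city names and distances are invented), using uniform-cost (Dijkstra-style) search as one classical instance of the kind of route-finding algorithms described above.

```python
import heapq

# A toy road map: city -> [(neighbor, cost)]. Names and costs are invented.
roads = {
    "San Francisco": [("Oakland", 10), ("San Jose", 45)],
    "Oakland":       [("Sacramento", 80), ("San Jose", 40)],
    "San Jose":      [("Sacramento", 120)],
    "Sacramento":    [],
}

def best_route(start, goal):
    """Uniform-cost (Dijkstra-style) lookahead search for a cheapest route."""
    frontier = [(0, start, [start])]   # (cost so far, city, path)
    explored = set()
    while frontier:
        cost, city, path = heapq.heappop(frontier)
        if city == goal:
            return cost, path
        if city in explored:
            continue
        explored.add(city)
        for neighbor, step in roads.get(city, []):
            heapq.heappush(frontier, (cost + step, neighbor, path + [neighbor]))
    return None

print(best_route("San Francisco", "Sacramento"))  # cost 90, via Oakland
```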

前瞻算法在特定任务上非常有效,但它们不太灵活。例如,AlphaGo"知道"围棋规则,但这仅仅是指它有两个用 C++ 等传统编程语言编写的子程序:一个子程序生成所有可能的合法走法,另一个子程序对目标进行编码,判定给定局面是赢还是输。要让 AlphaGo 玩一种不同的游戏,必须有人重写所有这些 C++ 代码。此外,如果你给它一个新目标——比如,访问围绕比邻星运行的系外行星——它会徒劳地探索数十亿种围棋走法序列,试图找到一种能实现该目标的序列。它无法查看 C++ 代码内部并确定一个显而易见的事实:没有任何围棋走法序列能把你送到比邻星。AlphaGo 的知识本质上被锁在了一个黑匣子里。

Lookahead algorithms are incredibly effective for their specific tasks, but they are not very flexible. For example, AlphaGo “knows” the rules of Go, but only in the sense that it has two subroutines, written in a traditional programming language such as C++: one subroutine generates all the possible legal moves and the other encodes the goal, determining whether a given state is won or lost. For AlphaGo to play a different game, someone has to rewrite all this C++ code. Moreover, if you give it a new goal—say, visiting the exoplanet that orbits Proxima Centauri—it will explore billions of sequences of Go moves in a vain attempt to find a sequence that achieves the goal. It cannot look inside the C++ code and determine the obvious: no sequence of Go moves gets you to Proxima Centauri. AlphaGo’s knowledge is essentially locked inside a black box.

1958 年,在达特茅斯夏季会议开启人工智能领域两年后,约翰·麦卡锡提出了一种更为通用的打开黑箱的方法:编写通用推理程序,可以吸收任何主题的知识并推理以回答任何可回答的问题。51一种特殊的推理是亚里士多德所建议的实用推理:“执行动作 A、B、C……将实现目标 G。”目标可以是任何事情:确保在我回家之前把房子收拾好,在不失去任何骑士的情况下赢得一场国际象棋比赛,减少 50% 的税收,访问比邻星,等等。麦卡锡的新程序类别很快被称为基于知识的系统。52

In 1958, two years after his Dartmouth summer meeting had initiated the field of artificial intelligence, John McCarthy proposed a much more general approach that opens up the black box: writing general-purpose reasoning programs that can absorb knowledge on any topic and reason with it to answer any answerable question.51 One particular kind of reasoning would be practical reasoning of the kind suggested by Aristotle: “Doing actions A, B, C, . . . will achieve goal G.” The goal could be anything at all: make sure the house is tidy before I get home, win a game of chess without losing either of your knights, reduce my taxes by 50 percent, visit Proxima Centauri, and so on. McCarthy’s new class of programs soon became known as knowledge-based systems.52

要使基于知识的系统成为可能,需要回答两个问题。首先,知识如何存储在计算机中?其次,计算机如何正确地利用这些知识推理得出新的结论?幸运的是,古希腊哲学家(尤其是亚里士多德)早在计算机出现之前就为这些问题提供了基本答案。事实上,如果亚里士多德有机会接触计算机(我想,还有一些电),他很可能成为一名人工智能研究员。麦卡锡重申了亚里士多德的答案,即使用形式逻辑B作为知识和推理的基础。

To make knowledge-based systems possible requires answering two questions. First, how can knowledge be stored in a computer? Second, how can a computer reason correctly with that knowledge to draw new conclusions? Fortunately, ancient Greek philosophers—particularly Aristotle—provided basic answers to these questions long before the advent of computers. In fact, it seems quite likely that, had Aristotle been given access to a computer (and some electricity, I suppose), he would have been an AI researcher. Aristotle’s answer, reiterated by McCarthy, was to use formal logicB as the basis for knowledge and reasoning.

计算机科学中有两种逻辑非常重要。第一种称为命题逻辑或布尔逻辑,古希腊人以及古代中国和印度的哲学家都知道这种逻辑。它与构成计算机芯片电路的与门、非门等所使用的语言相同。从相当字面的意义上讲,现代 CPU 只是一个用命题逻辑语言写成的非常大的数学表达式——长达数亿页。第二种逻辑,也是麦卡锡提议用于人工智能的逻辑,称为一阶逻辑。B一阶逻辑的语言比命题逻辑更具表达力,这意味着有些东西用一阶逻辑可以很容易地表达,但用命题逻辑写出来则非常痛苦甚至不可能。例如,围棋规则用一阶逻辑大约需要一页,用命题逻辑却需要数百万页。同样,我们可以轻松地表达有关国际象棋、英国公民身份、税法、买卖、搬家、绘画、烹饪以及我们常识世界中许多其他方面的知识。

There are two kinds of logic that really matter in computer science. The first, called propositional or Boolean logic, was known to the Greeks as well as to ancient Chinese and Indian philosophers. It is the same language of AND gates, NOT gates, and so on that makes up the circuitry of computer chips. In a very literal sense, a modern CPU is just a very large mathematical expression—hundreds of millions of pages—written in the language of propositional logic. The second kind of logic, and the one that McCarthy proposed to use for AI, is called first-order logic.B The language of first-order logic is far more expressive than propositional logic, which means that there are things that can be expressed very easily in first-order logic that are painful or impossible to write in propositional logic. For example, the rules of Go take about a page in first-order logic but millions of pages in propositional logic. Similarly, we can easily express knowledge about chess, British citizenship, tax law, buying and selling, moving, painting, cooking, and many other aspects of our commonsense world.
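
为了说明这种表达力上的差距,这里有一个简单的示意(领域和谓词均为虚构的教学例子):一条带量词的一阶规则,在命题化之后会随领域大小膨胀为许多条规则。

To illustrate the gap in expressive power, here is a simple sketch (the domain and predicates are invented teaching examples): a single quantified first-order rule, once propositionalized, blows up into one rule per object in the domain.

```python
# One first-order rule, "for all x: Human(x) implies Mortal(x)", stands for
# as many propositional rules as there are objects in the domain -- which is
# why propositional encodings of rules like those of Go blow up on large boards.
domain = ["socrates", "plato", "aristotle"]   # invented mini-domain
premise_pred, conclusion_pred = "Human", "Mortal"

propositional_rules = [
    (f"{premise_pred}({x})", f"{conclusion_pred}({x})") for x in domain
]
for premise, conclusion in propositional_rules:
    print(premise, "=>", conclusion)
print(len(propositional_rules), "propositional rules from one first-order rule")
```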

因此,原则上,使用一阶逻辑进行推理的能力使我们在实现通用智能方面取得了长足进步。1930 年,才华横溢的奥地利逻辑学家库尔特·哥德尔发表了他著名的完备性定理53,证明了存在一种具有以下属性的算法:54

In principle, then, the ability to reason with first-order logic gets us a long way towards general-purpose intelligence. In 1930, the brilliant Austrian logician Kurt Gödel had published his famous completeness theorem,53 proving that there is an algorithm with the following property:54

对于任何知识集合和任何能够用一阶逻辑表达的问题,如果问题有答案,该算法就会告诉我们答案。

For any collection of knowledge and any question expressible in first-order logic, the algorithm will tell us the answer to the question if there is one.

这是一项相当不可思议的保证。例如,这意味着我们可以告诉系统围棋规则,它会告诉我们(如果我们等待足够长的时间)是否存在一种能赢得比赛的开局走法。我们可以告诉它有关当地地理的事实,它会告诉我们去机场的路。我们可以告诉它有关几何、运动和餐具的事实,它会告诉机器人如何为晚餐摆好餐桌。更一般地说,给定任何可实现的目标以及对其动作效果的充分了解,代理都可以使用该算法构建一个可执行的计划来实现该目标。

This is a pretty incredible guarantee. It means, for example, that we can tell the system the rules of Go and it will tell us (if we wait long enough) whether there is an opening move that wins the game. We can tell it facts about local geography, and it will tell us the way to the airport. We can tell it facts about geometry and motion and utensils, and it will tell the robot how to lay the table for dinner. More generally, given any achievable goal and sufficient knowledge of the effects of its actions, an agent can use the algorithm to construct a plan that it can execute to achieve the goal.

图 4:Shakey 机器人,约 1970 年。背景是 Shakey 在房间内推动的一些物体。

FIGURE 4: Shakey the robot, circa 1970. In the background are some of the objects that Shakey pushed around in its suite of rooms.

必须指出的是,哥德尔实际上并没有给出算法;他只是证明了算法的存在。20 世纪 60 年代初,用于逻辑推理的真正算法开始出现,55麦卡锡的基于逻辑的通用智能系统之梦似乎触手可及。世界上第一个大型移动机器人项目——SRI 的 Shakey 项目——就是基于逻辑推理的(见图 4)。Shakey 从人类设计师那里获得一个目标,使用视觉算法创建描述当前情况的逻辑断言,进行逻辑推理以得出实现目标的有保证的计划,然后执行该计划。Shakey 是亚里士多德对人类认知和行为的分析至少部分正确的"活生生"的证明。

It must be said that Gödel did not actually provide an algorithm; he merely proved that one existed. In the early 1960s, real algorithms for logical reasoning began to appear,55 and McCarthy’s dream of generally intelligent systems based on logic seemed within reach. The first major mobile robot project in the world, SRI’s Shakey project, was based on logical reasoning (see figure 4). Shakey received a goal from its human designers, used vision algorithms to create logical assertions describing the current situation, performed logical inference to derive a guaranteed plan to achieve the goal, and then executed the plan. Shakey was “living” proof that Aristotle’s analysis of human cognition and action was at least partially correct.

不幸的是,亚里士多德(和麦卡锡)的分析远非完全正确。主要问题是无知——我要赶紧补充,这并非亚里士多德或麦卡锡个人的无知,而是所有人类和机器的无知,现在和将来都是如此。我们的知识中只有很少一部分是绝对确定的。特别是,我们对未来知之甚少。无知对于纯逻辑系统来说是一个无法克服的问题。如果我问:"如果我在航班起飞前三个小时出发,我能准时到达机场吗?"或"我能通过购买中奖彩票然后用奖金买房来获得一栋房子吗?"在每种情况下,正确的答案都是"我不知道"。原因是,对于每个问题,是和否在逻辑上都是可能的。实际上,除非已经知道答案,否则人们永远无法对任何经验问题绝对确定。56幸运的是,确定性对于行动来说完全没有必要:我们只需要知道哪种行动最好,而不是哪种行动肯定会成功。

Unfortunately, Aristotle’s (and McCarthy’s) analysis was far from being completely correct. The main problem is ignorance—not, I hasten to add, on the part of Aristotle or McCarthy, but on the part of all humans and machines, present and future. Very little of our knowledge is absolutely certain. In particular, we don’t know very much about the future. Ignorance is just an insuperable problem for a purely logical system. If I ask, “Will I get to the airport on time, if I leave three hours before my flight?” or “Can I obtain a house by buying a winning lottery ticket and then buying the house with the proceeds?” the correct answer will be, in each case, “I don’t know.” The reason is that, for each question, both yes and no are logically possible. As a practical matter, one can never be absolutely certain of any empirical question unless the answer is already known.56 Fortunately, certainty is completely unnecessary for action: we just need to know which action is best, not which action is certain to succeed.

不确定性意味着"赋予机器的目的"通常不能是一个必须不惜一切代价实现的、精确划定的目标。不再存在"实现目标的动作序列"这样的东西,因为任何动作序列都会有多种可能的结果,其中一些结果无法实现目标。成功的可能性确实很重要:提前三个小时出发前往机场可能意味着你不会错过航班,购买彩票可能意味着你会赢得足够的钱来买一套新房子,但这两个"可能"的含义大不相同。试图寻找使达成目标概率最大化的计划,也无法挽救"目标"这一概念。一个使及时赶到机场搭上航班的概率最大化的计划,可能涉及提前几天离开家、组织武装护送、安排多种备用交通工具以防其他交通工具发生故障,等等。不可避免地,人们必须把不同结果的相对可取性连同它们的可能性一起纳入考量。

Uncertainty means that the “purpose put into the machine” cannot, in general, be a precisely delineated goal, to be achieved at all costs. There is no longer such a thing as a “sequence of actions that achieves the goal,” because any sequence of actions will have multiple possible outcomes, some of which won’t achieve the goal. The likelihood of success really matters: leaving for the airport three hours in advance of your flight may mean that you won’t miss the flight and buying a lottery ticket may mean that you’ll win enough to buy a new house, but these are very different mays. Goals cannot be rescued by looking for plans that maximize the probability of achieving the goal. A plan that maximizes the probability of getting to the airport in time to catch a flight might involve leaving home days in advance, organizing an armed escort, lining up many alternative means of transport in case the others break down, and so on. Inevitably, one must take into account the relative desirabilities of different outcomes as well as their likelihoods.

那么,我们可以使用效用函数来描述不同结果或状态序列的可取性,而不是目标。通常,状态序列的效用表示为序列中每个状态的奖励总和。给定一个由效用定义的目的或奖励函数,机器旨在产生最大化其预期效用或预期奖励总和的行为,这些行为是按概率加权的可能结果的平均值。现代人工智能在一定程度上是麦卡锡梦想的重启,只不过用效用和概率代替了目标和逻辑。

Instead of a goal, then, we could use a utility function to describe the desirability of different outcomes or sequences of states. Often, the utility of a sequence of states is expressed as a sum of rewards for each of the states in the sequence. Given a purpose defined by a utility or reward function, the machine aims to produce behavior that maximizes its expected utility or expected sum of rewards, averaged over the possible outcomes weighted by their probabilities. Modern AI is partly a rebooting of McCarthy’s dream, except with utilities and probabilities instead of goals and logic.
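
下面用一个虚构的机场出行例子(概率和效用数字均为假设)示意"最大化预期效用"的计算方式。

The following sketch illustrates the expected-utility calculation with an invented airport example (the probabilities and utilities are assumed numbers).

```python
# Choosing the action with the highest expected utility.
# Outcomes, probabilities, and utilities below are invented for illustration.
actions = {
    "leave 3 hours early": [(0.98, 100), (0.02, -1000)],  # (prob, utility)
    "leave 30 min early":  [(0.60, 120), (0.40, -1000)],
}

def expected_utility(outcomes):
    """Average utility over possible outcomes, weighted by probability."""
    return sum(p * u for p, u in outcomes)

best = max(actions, key=lambda a: expected_utility(actions[a]))
for a, outs in actions.items():
    print(f"{a}: EU = {expected_utility(outs):.1f}")
print("best action:", best)  # leaving early wins despite the lower payoff
```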

伟大的法国数学家皮埃尔-西蒙·拉普拉斯(Pierre-Simon Laplace)于 1814 年写道:"概率论不过是化为微积分的常识。"57然而,直到 20 世纪 80 年代,才为概率知识开发出一种实用的形式语言和推理算法。这就是 Judea Pearl 引入的贝叶斯网络语言。C粗略地说,贝叶斯网络是命题逻辑的概率表亲。一阶逻辑也有概率表亲,包括贝叶斯逻辑58和种类繁多的概率编程语言。

Pierre-Simon Laplace, the great French mathematician, wrote in 1814, “The theory of probabilities is just common sense reduced to calculus.”57 It was not until the 1980s, however, that a practical formal language and reasoning algorithms were developed for probabilistic knowledge. This was the language of Bayesian networks,C introduced by Judea Pearl. Roughly speaking, Bayesian networks are the probabilistic cousins of propositional logic. There are also probabilistic cousins of first-order logic, including Bayesian logic58 and a wide variety of probabilistic programming languages.

贝叶斯网络和贝叶斯逻辑以英国牧师托马斯·贝叶斯的名字命名,他对现代思想的持久贡献——现称为贝叶斯定理——由他的朋友理查德·普莱斯于 1763 年(即他去世后不久)发表。59定理的现代形式由拉普拉斯提出,以一种非常简单的方式描述了先验概率——人们对一组可能假设的初始信念程度——如何通过观察到一些证据而变成后验概率。随着更多新证据的出现,后验成为新的先验,贝叶斯更新过程无限重复。这个过程是如此基础,以至于现代将理性视为最大化预期效用的理念有时被称为贝叶斯理性。它假设理性主体可以根据其所有过去经验,获得有关世界可能当前状态以及对未来假设的后验概率分布。

Bayesian networks and Bayesian logic are named after the Reverend Thomas Bayes, a British clergyman whose lasting contribution to modern thought—now known as Bayes’ theorem—was published in 1763, shortly after his death, by his friend Richard Price.59 In its modern form, as suggested by Laplace, the theorem describes in a very simple way how a prior probability—the initial degree of belief one has in a set of possible hypotheses—becomes a posterior probability as a result of observing some evidence. As more new evidence arrives, the posterior becomes the new prior and the process of Bayesian updating repeats ad infinitum. This process is so fundamental that the modern idea of rationality as maximization of expected utility is sometimes called Bayesian rationality. It assumes that a rational agent has access to a posterior probability distribution over possible current states of the world, as well as over hypotheses about the future, based on all its past experience.
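
下面是贝叶斯更新的一个极简示意(硬币假设和似然均为教学用的虚构设定):每条新证据到来后,后验成为新的先验。

Here is a minimal sketch of Bayesian updating (the coin hypotheses and likelihoods are invented for teaching purposes): after each piece of evidence, the posterior becomes the new prior.

```python
# Bayesian updating: prior -> posterior, repeated as evidence arrives.
def bayes_update(prior, likelihood, evidence):
    """prior: {hypothesis: P(h)}; likelihood: {h: {e: P(e|h)}}."""
    unnormalized = {h: prior[h] * likelihood[h][evidence] for h in prior}
    total = sum(unnormalized.values())
    return {h: p / total for h, p in unnormalized.items()}

prior = {"fair coin": 0.5, "two-headed coin": 0.5}
likelihood = {
    "fair coin":       {"heads": 0.5, "tails": 0.5},
    "two-headed coin": {"heads": 1.0, "tails": 0.0},
}

belief = prior
for observation in ["heads", "heads", "heads"]:
    belief = bayes_update(belief, likelihood, observation)
    print(observation, "->", {h: round(p, 3) for h, p in belief.items()})
```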

运筹学、控制理论和人工智能领域的研究人员还开发了各种用于在不确定情况下进行决策的算法,其中一些可以追溯到 20 世纪 50 年代。这些所谓的"动态规划"算法是前瞻搜索与规划的概率表亲,能为金融、物流、运输等不确定性起重要作用的各类实际问题生成最优或接近最优的行为。C目的以奖励函数的形式放入这些机器中,输出则是一种策略,为代理可能陷入的每个可能状态指定一个动作。

Researchers in operations research, control theory, and AI have also developed a variety of algorithms for decision making under uncertainty, some dating back to the 1950s. These so-called “dynamic programming” algorithms are the probabilistic cousins of lookahead search and planning and can generate optimal or near-optimal behavior for all sorts of practical problems in finance, logistics, transportation, and so on, where uncertainty plays a significant role.C The purpose is put into these machines in the form of a reward function, and the output is a policy that specifies an action for every possible state the agent could get itself into.

对于西洋双陆棋和围棋等复杂问题,由于状态数量巨大,而且只在游戏结束时才有奖励,前瞻搜索无法发挥作用。为此,人工智能研究人员开发了一种名为强化学习(简称 RL)的方法。强化学习算法从环境中奖励信号的直接经验中学习,就像婴儿从直立的正奖励和跌倒的负奖励中学会站立一样。与动态规划算法一样,放入强化学习算法的目的就是奖励函数,而算法会学习一个对状态价值(有时是动作价值)的估计器。这个估计器可以与相对短视的前瞻搜索相结合,以产生高度胜任的行为。

For complex problems such as backgammon and Go, where the number of states is enormous and the reward comes only at the end of the game, lookahead search won’t work. Instead, AI researchers have developed a method called reinforcement learning, or RL for short. RL algorithms learn from direct experience of reward signals in the environment, much as a baby learns to stand up from the positive reward of being upright and the negative reward of falling over. As with dynamic programming algorithms, the purpose put into an RL algorithm is the reward function, and the algorithm learns an estimator for the value of states (or sometimes the value of actions). This estimator can be combined with relatively myopic lookahead search to generate highly competent behavior.
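
下面是一个表格型时序差分(TD(0))学习状态值的极简示意;走廊环境、学习率和折扣因子都是为演示而虚构的。

Below is a minimal sketch of tabular temporal-difference (TD(0)) learning of state values; the corridor environment, learning rate, and discount factor are invented for the demonstration.

```python
import random

# Tabular TD(0) learning of state values in a tiny corridor world:
# states 0..4; reaching state 4 yields reward +1, everything else 0.
N, alpha, gamma, episodes = 5, 0.1, 0.9, 5000
V = [0.0] * N  # the terminal state's value stays 0 by convention

for _ in range(episodes):
    s = 0
    while s != N - 1:
        s_next = min(s + random.choice([1, 2]), N - 1)       # a random "policy"
        reward = 1.0 if s_next == N - 1 else 0.0
        V[s] += alpha * (reward + gamma * V[s_next] - V[s])  # TD(0) update
        s = s_next

print([round(v, 2) for v in V])  # values rise as states get closer to the goal
```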

第一个成功的强化学习系统是亚瑟·塞缪尔(Arthur Samuel)的跳棋程序,它于 1956 年在电视上演示时引起了轰动。该程序基本上从零开始学习,通过与自己对弈并观察获胜和失败的回报。60 1992 年,Gerry Tesauro 将同样的想法应用于西洋双陆棋,在 1,500,000 局对弈后达到了世界冠军水平。61从 2016 年开始,DeepMind 的 AlphaGo 及其后代使用强化学习和自我对弈,在围棋、国际象棋和将棋上击败了最优秀的人类棋手。

The first successful reinforcement learning system was Arthur Samuel’s checkers program, which created a sensation when it was demonstrated on television in 1956. The program learned essentially from scratch, by playing against itself and observing the rewards of winning and losing.60 In 1992, Gerry Tesauro applied the same idea to the game of backgammon, achieving world-champion-level play after 1,500,000 games.61 Beginning in 2016, DeepMind’s AlphaGo and its descendants used reinforcement learning and self-play to defeat the best human players at Go, chess, and shogi.

强化学习算法还可以学习如何根据原始感知输入选择动作。例如,DeepMind 的 DQN 系统完全从头开始学会了玩 49 种不同的 Atari 电子游戏,包括 Pong、Freeway 和 Space Invaders。62它仅使用屏幕像素作为输入,以游戏分数作为奖励信号。在大多数游戏中,DQN 都学得比职业人类玩家更好——尽管 DQN 没有任何关于时间、空间、物体、运动、速度或射击的先验概念。除了获胜之外,很难弄清楚 DQN 实际上在做什么。

Reinforcement learning algorithms can also learn how to select actions based on raw perceptual input. For example, DeepMind’s DQN system learned to play forty-nine different Atari video games entirely from scratch—including Pong, Freeway, and Space Invaders.62 It used only the screen pixels as input and the game score as a reward signal. In most of the games, DQN learned to play better than a professional human player—despite the fact that DQN has no a priori notion of time, space, objects, motion, velocity, or shooting. It is quite hard to work out what DQN is actually doing, besides winning.

如果一个新生儿在出生第一天就学会了玩几十种超人水平的电子游戏,或者成为围棋、国际象棋和将棋的世界冠军,我们可能会怀疑这是恶魔附身或外星人干预。但请记住,所有这些任务都比现实世界简单得多:它们是完全可观察的,涉及的时间范围很短,状态空间相对较小,规则简单可预测。放宽任何这些条件都意味着标准方法将失败。

If a newborn baby learned to play dozens of video games at superhuman levels on its first day of life, or became world champion at Go, chess, and shogi, we might suspect demonic possession or alien intervention. Remember, however, that all these tasks are much simpler than the real world: they are fully observable, they involve short time horizons, and they have relatively small state spaces and simple, predictable rules. Relaxing any of these conditions means that the standard methods will fail.

另一方面,当前的研究正是为了超越标准方法,使人工智能系统能够在更大类别的环境中运行。例如,就在我写上一段的那天,OpenAI 宣布其由五个人工智能程序组成的团队已经学会在 Dota 2 游戏中击败经验丰富的人类团队。(对于包括我在内的外行人来说:Dota 2 是《魔兽争霸》系列实时战略游戏《远古防御》的更新版本;它是目前最赚钱、竞争最激烈的电子竞技项目,奖金高达数百万美元。)Dota 2 涉及沟通、团队合作以及准连续的时间和空间。一局游戏持续数万个时间步,某种程度的行为层级组织似乎必不可少。比尔·盖茨将这一消息描述为"推动人工智能发展的一大里程碑"。63几个月后,该程序的更新版本击败了世界顶级的职业 Dota 2 战队。64

Current research, on the other hand, is aimed precisely at going beyond standard methods so that AI systems can operate in larger classes of environments. On the day I wrote the preceding paragraph, for example, OpenAI announced that its team of five AI programs had learned to beat experienced human teams at the game Dota 2. (For the uninitiated, who include me: Dota 2 is an updated version of Defense of the Ancients, a real-time strategy game in the Warcraft family; it is currently the most lucrative and competitive e-sport, with prizes in the millions of dollars.) Dota 2 involves communication, teamwork, and quasi-continuous time and space. Games last for tens of thousands of time steps, and some degree of hierarchical organization of behavior seems to be essential. Bill Gates described the announcement as “a huge milestone in advancing artificial intelligence.”63 A few months later, an updated version of the program defeated the world’s top professional Dota 2 team.64

围棋和 Dota 2 等游戏是强化学习方法的良好试验场,因为奖励函数随游戏规则一同给定。然而,现实世界没有这么方便,已经有几十起奖励定义不当导致怪异和意料之外行为的案例。65有些是无害的,比如一个本应进化出快速移动生物的模拟进化系统,实际上却产生了身形极高、靠跌倒来实现快速移动的生物。66其他的就没那么无害了,比如似乎正把我们的世界搅得一团糟的社交媒体点击率优化器。

Games such as Go and Dota 2 are a good testing ground for reinforcement learning methods because the reward function comes with the rules of the game. The real world is less convenient, however, and there have been dozens of cases in which faulty definitions of rewards led to weird and unanticipated behaviors.65 Some are innocuous, like the simulated evolution system that was supposed to evolve fast-moving creatures but in fact produced creatures that were enormously tall and moved fast by falling over.66 Others are less innocuous, like the social-media click-through optimizers that seem to be making a fine mess of our world.

我将要讨论的最后一类代理程序是最简单的:将感知直接与行动联系起来,而无需任何中间的思考或推理。在人工智能中,我们将这种程序称为反射代理——指的是人类和动物表现出的低级神经反射,这些反射不受思想的影响。67例如,人类的眨眼反射将视觉系统中低级处理电路的输出直接连接到控制眼睑的运动区域,因此视野中任何快速出现的区域都会引起剧烈的眨眼。现在你可以试着(不要太用力)用手指戳自己的眼睛来测试它。我们可以将这个反射系统视为以下形式的简单“规则”:

The final category of agent program I will consider is the simplest: programs that connect perception directly to action, without any intermediate deliberation or reasoning. In AI, we call this kind of program a reflex agent—a reference to the low-level neural reflexes exhibited by humans and animals, which are not mediated by thought.67 For example, the human blinking reflex connects the outputs of low-level processing circuits in the visual system directly to the motor area that controls the eyelids, so that any rapidly looming region in the visual field causes a hard blink. You can test it now by trying (not too hard) to poke yourself in the eye with your finger. We can think of this reflex system as a simple “rule” of the following form:

如果 <视野中出现快速逼近区域> 则 <眨眼>。

if <rapidly looming region in visual field> then <blink>.
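
用代码来写,这条"规则"就是一个固定的感知到动作的映射(感知的编码方式是虚构的):

Written as code, the "rule" is just a fixed mapping from percept to action (the encoding of the percept is invented):

```python
# A reflex agent is a fixed mapping from percept to action;
# no objective or world model is represented anywhere inside it.
def blink_reflex(percept):
    # percept is a dict of low-level visual features (invented encoding)
    if percept.get("rapidly_looming_region"):
        return "blink"
    return "do nothing"

print(blink_reflex({"rapidly_looming_region": True}))   # blink
print(blink_reflex({"rapidly_looming_region": False}))  # do nothing
```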

眨眼反射并不"知道自己在做什么":目的(保护眼球免受异物伤害)没有在任何地方得到表示;知识(快速逼近的区域对应于接近眼睛的物体,而接近眼睛的物体可能会伤害眼睛)同样没有得到表示。因此,当你的非反射部分想要滴眼药水时,反射部分仍然会眨眼。

The blinking reflex does not “know what it’s doing”: the objective (of shielding the eyeball from foreign objects) is nowhere represented; the knowledge (that a rapidly looming region corresponds to an object approaching the eye, and that an object approaching the eye might damage it) is nowhere represented. Thus, when the non-reflex part of you wants to put in eye drops, the reflex part still blinks.

另一个常见的反应是紧急刹车——当前车意外停车或行人踏上道路时。快速决定是否需要刹车并不容易:2018 年,一辆自动驾驶测试车撞死了一名行人,Uber 解释说,“当车辆处于计算机控制下时,不会启用紧急刹车操作,以降低车辆行为不稳定的可能性。” 68在这里,人类设计师的目标很明确——不要撞死行人——但智能体的策略(如果被激活)执行得不正确。同样,目标没有体现在智能体中:如今没有一辆自动驾驶汽车知道人们不喜欢被撞死。

Another familiar reflex is emergency braking—when the car in front stops unexpectedly or a pedestrian steps into the road. Quickly deciding whether braking is required is not easy: when a test vehicle in autonomous mode killed a pedestrian in 2018, Uber explained that “emergency braking maneuvers are not enabled while the vehicle is under computer control, to reduce the potential for erratic vehicle behavior.”68 Here, the human designer’s objective is clear—don’t kill pedestrians—but the agent’s policy (had it been activated) implements it incorrectly. Again, the objective is not represented in the agent: no autonomous vehicle today knows that people don’t like to be killed.

反射动作在更常规的任务中也发挥着作用,比如保持在车道内:当汽车稍微偏离理想车道位置时,一个简单的反馈控制系统可以将方向盘向相反方向轻推,以纠正漂移。轻推的幅度取决于汽车漂移的程度。这类控制系统通常设计为最小化随时间累积的跟踪误差平方和。设计者推导出一个反馈控制律,在某些关于速度和道路曲率的假设下,该控制律近似地实现了这种最小化。69你站立时,一个类似的系统一直在运转;如果它停止工作,你会在几秒钟内摔倒。与眨眼反射一样,很难关闭这个机制并任由自己摔倒。

Reflex actions also play a role in more routine tasks such as staying in lane: as the car drifts ever so slightly out of the ideal lane position, a simple feedback control system can nudge the steering wheel in the opposite direction to correct the drift. The size of the nudge would depend on how far the car drifted. These kinds of control systems are usually designed to minimize the square of the tracking error added up over time. The designer derives a feedback control law that, under certain assumptions about speed and road curvature, approximately implements this minimization.69 A similar system is operating all the time while you are standing up; if it were to stop working, you’d fall over within a few seconds. As with the blinking reflex, it’s quite hard to turn this mechanism off and allow yourself to fall over.
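
下面是一个最简单的比例反馈控制器示意(增益和车辆响应均为虚构的玩具模型);实际系统会如正文所述最小化累积的误差平方和,比例控制只是其中最简单的近亲。

Here is a sketch of the simplest proportional feedback controller (the gain and vehicle response are an invented toy model); real systems minimize the accumulated squared error as described above, of which proportional control is only the simplest relative.

```python
# A proportional feedback controller for lane keeping: steer against the
# drift, in proportion to the tracking error. Gain and dynamics are invented.
def simulate(k_p=0.4, steps=10):
    error = 1.0                        # initial lateral offset (meters)
    for t in range(steps):
        steering = -k_p * error        # nudge opposite to the drift
        error += steering              # toy vehicle response to the nudge
        print(f"t={t:2d}  offset={error:+.3f} m")

simulate()  # the offset decays geometrically toward the lane center
```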

因此,反射代理实现的是设计者的目标,但它既不知道目标是什么,也不知道自己为什么要以某种方式行动。这意味着它们实际上无法为自己做出决定;其他人——通常是人类设计者,或许还有生物进化的过程——必须提前决定一切。除了井字游戏或紧急刹车等非常简单的任务外,很难通过手动编程创建一个好的反射代理。即使在这些情况下,反射代理也极其不灵活,当情况表明所实施的策略不再合适时,它无法改变自己的行为。

Reflex agents, then, implement a designer’s objective, but do not know what the objective is or why they are acting in a certain way. This means they cannot really make decisions for themselves; someone else, typically the human designer or perhaps the process of biological evolution, has to decide everything in advance. It is very hard to create a good reflex agent by manual programming except for very simple tasks such as tic-tac-toe or emergency braking. Even in those cases, the reflex agent is extremely inflexible and cannot change its behavior when circumstances indicate that the implemented policy is no longer appropriate.

创建更强大的反射代理的一种可能方法是通过从示例中学习的过程。D人类可以提供决策问题的示例以及在每种情况下应做出的正确决策,而不是指定行为规则,或提供奖励函数或目标。例如,我们可以通过提供法语句子示例以及正确的英语翻译来创建法语到英语的翻译代理。(幸运的是,加拿大和欧盟议会每年都会生成数百万个此类示例。)然后,监督学习算法会处理这些示例以生成一条复杂规则,该规则将任何法语句子作为输入并生成英文翻译。目前,机器翻译的冠军学习算法是一种所谓的深度学习,它以人工神经网络的形式生成规则,该神经网络具有数百层和数百万个参数。其他深度学习算法已被证明非常擅长对图像中的对象进行分类以及识别语音信号中的单词。机器翻译、语音识别和视觉对象识别是人工智能中最重要的三个子领域,这就是为什么人们对深度学习的前景如此兴奋的原因。

One possible way to create more powerful reflex agents is through a process of learning from examples.D Rather than specifying a rule for how to behave, or supplying a reward function or a goal, a human can supply examples of decision problems along with the correct decision to make in each case. For example, we can create a French-to-English translation agent by supplying examples of French sentences along with the correct English translations. (Fortunately, the Canadian and EU parliaments generate millions of such examples every year.) Then a supervised learning algorithm processes the examples to produce a complex rule that takes any French sentence as input and produces an English translation. The current champion learning algorithm for machine translation is a form of so-called deep learning, and it produces a rule in the form of an artificial neural network with hundreds of layers and millions of parameters.D Other deep learning algorithms have turned out to be very good at classifying the objects in images and recognizing the words in a speech signal. Machine translation, speech recognition, and visual object recognition are three of the most important subfields in AI, which is why there has been so much excitement about the prospects for deep learning.

关于深度学习是否会直接导致人类水平的人工智能,人们可以无休止地争论。我自己的观点(稍后我会解释)是,它远远没有达到所需的水平,但现在让我们先关注这些方法如何融入人工智能的标准模型,即由算法优化一个固定目标。对于深度学习,或者说对于任何监督学习算法,"赋予机器的目的"通常是最大化预测准确性——或者等价地,最小化错误。这似乎显而易见,但实际上有两种理解方式,取决于学到的规则将在整个系统中扮演的角色。第一种角色是纯粹的感知角色:网络处理感官输入,并以感知内容概率估计的形式向系统其余部分提供信息。如果它是一种物体识别算法,也许它会说"70% 的概率是诺福克梗,30% 的概率是诺里奇梗"。70系统的其余部分根据这些信息决定要采取的外部行动。这个纯粹的感知目标在以下意义上是没有问题的:即使是"安全"的超级智能 AI 系统(与基于标准模型的"不安全"系统相对),也需要使其感知系统尽可能准确且校准良好。

One can argue almost endlessly about whether deep learning will lead directly to human-level AI. My own view, which I will explain later, is that it falls far short of what is needed,D but for now let’s focus on how such methods fit into the standard model of AI, where an algorithm optimizes a fixed objective. For deep learning, or indeed for any supervised learning algorithm, the “purpose put into the machine” is usually to maximize predictive accuracy—or, equivalently, to minimize error. That much seems obvious, but there are actually two ways to understand it, depending on the role that the learned rule is going to play in the overall system. The first role is a purely perceptual role: the network processes the sensory input and provides information to the rest of the system in the form of probability estimates for what it’s perceiving. If it’s an object recognition algorithm, maybe it says “70 percent probability it’s a Norfolk terrier, 30 percent it’s a Norwich terrier.”70 The rest of the system decides on an external action to take based on this information. This purely perceptual objective is unproblematic in the following sense: even a “safe” superintelligent AI system, as opposed to an “unsafe” one based on the standard model, needs to have its perception system as accurate and well calibrated as possible.

当我们从纯粹的感知角色转向决策角色时,问题就出现了。例如,一个训练好的物体识别网络可能会自动为网站或社交媒体帐户上的图片生成标签。发布这些标签是一种会产生后果的行为。每个贴标签操作都需要一个实际的分类决策,除非每个决策都保证完美无缺,否则人类设计师必须提供一个损失函数,说明将 A 类对象错误分类为 B 类对象的成本。谷歌就是这样遇到了一个与大猩猩有关的不幸问题。2015 年,一位名叫 Jacky Alciné 的软件工程师在 Twitter 上抱怨说,Google Photos 图像标记服务把他和他的朋友标记成了大猩猩。71虽然目前尚不清楚这个错误究竟是如何发生的,但几乎可以肯定,谷歌的机器学习算法被设计为最小化一个固定的、明确的损失函数——而且,该函数为任何错误分配了相同的成本。换句话说,它假设将一个人误分类为大猩猩的成本与将诺福克梗误分类为诺里奇梗的成本相同。显然,这不是谷歌(或其用户)真正的损失函数,随后发生的公关灾难就证明了这一点。

The problem comes when we move from a purely perceptual role to a decision-making role. For example, a trained network for recognizing objects might automatically generate labels for images on a Web site or social-media account. Posting those labels is an action with consequences. Each labeling action requires an actual classification decision, and unless every decision is guaranteed to be perfect, the human designer must supply a loss function that spells out the cost of misclassifying an object of type A as an object of type B. And that’s how Google had an unfortunate problem with gorillas. In 2015, a software engineer named Jacky Alciné complained on Twitter that the Google Photos image-labeling service had labeled him and his friend as gorillas.71 While it is unclear how exactly this error occurred, it is almost certain that Google’s machine learning algorithm was designed to minimize a fixed, definite loss function—moreover, one that assigned equal cost to any error. In other words, it assumed that the cost of misclassifying a person as a gorilla was the same as the cost of misclassifying a Norfolk terrier as a Norwich terrier. Clearly, this is not Google’s (or their users’) true loss function, as was illustrated by the public relations disaster that ensued.

由于可能的图像标签有数千种,将一个类别误分类为另一个类别可能会产生数百万种不同的成本。即使谷歌尝试过,也很难预先指定所有这些数字。相反,正确的做法是承认真正的误分类成本存在不确定性,并设计一种对成本和成本不确定性足够敏感的学习和分类算法。这样的算法可能会偶尔问谷歌设计师这样的问题:“将狗误分类为猫和将人误分类为动物,哪个更糟糕?”此外,如果误分类成本存在很大的不确定性,算法很可能会拒绝标记某些图像。

Since there are thousands of possible image labels, there are millions of potentially distinct costs associated with misclassifying one category as another. Even if it had tried, Google would have found it very difficult to specify all these numbers up front. Instead, the right thing to do would be to acknowledge the uncertainty about the true misclassification costs and to design a learning and classification algorithm that was suitably sensitive to costs and uncertainty about costs. Such an algorithm might occasionally ask the Google designer questions such as “Which is worse, misclassifying a dog as a cat or misclassifying a person as an animal?” In addition, if there is significant uncertainty about misclassification costs, the algorithm might well refuse to label some images.
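
下面的示意(概率与代价数字均为虚构)展示了正文所说的做法:选择使预期误分类代价最小的标签,并在每个选择都太冒险时拒绝贴标签。

The sketch below (probabilities and costs are invented) shows the approach described above: pick the label that minimizes expected misclassification cost, and refuse to label when every choice is too risky.

```python
# Choosing a label to minimize expected misclassification cost, and refusing
# to label when every choice is too risky. All numbers here are invented.
probs = {"person": 0.6, "gorilla": 0.4}   # classifier's posterior beliefs
cost = {                                   # cost[true_label][predicted_label]
    "person":  {"person": 0,  "gorilla": 1000},  # mislabeling a person is very costly
    "gorilla": {"person": 25, "gorilla": 0},
}
ABSTAIN_COST = 5   # fixed cost of answering "not seeing this clearly yet"

def expected_cost(label):
    return sum(probs[truth] * cost[truth][label] for truth in probs)

choices = {label: expected_cost(label) for label in probs}
choices["abstain"] = ABSTAIN_COST
decision = min(choices, key=choices.get)
print(choices, "->", decision)   # abstaining wins with these numbers
```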

到 2018 年初,据报道 Google Photos 确实拒绝对大猩猩的照片进行分类。当看到一张非常清晰的大猩猩和两只幼崽的照片时,它会说:“嗯……还没看清楚。” 72

By early 2018, it was reported that Google Photos does refuse to classify a photo of a gorilla. Given a very clear image of a gorilla with two babies, it says, “Hmm . . . not seeing this clearly yet.”72

我并不想说人工智能采用标准模型在当时是一个糟糕的选择。大量出色的工作已经投入到在逻辑、概率和学习系统中开发该模型的各种实例上。由此产生的许多系统非常有用;正如我们将在下一章中看到的那样,未来还会有更多。另一方面,我们不能继续依赖我们通常的做法,即通过反复试验来消除目标函数中的主要错误:智能越来越高、全球影响越来越大的机器不会给我们这种奢侈。

I don’t wish to suggest that AI’s adoption of the standard model was a poor choice at the time. A great deal of brilliant work has gone into developing the various instantiations of the model in logical, probabilistic, and learning systems. Many of the resulting systems are very useful; as we will see in the next chapter, there is much more to come. On the other hand, we cannot continue to rely on our usual practice of ironing out the major errors in an objective function by trial and error: machines of increasing intelligence and increasingly global impact will not allow us that luxury.

3

3

人工智能未来将会如何发展?

HOW MIGHT AI PROGRESS IN THE FUTURE?

不久的将来

The Near Future

1997 年 5 月 3 日,IBM 制造的国际象棋计算机“深蓝”与国际象棋世界冠军、可能是历史上最优秀的人类棋手加里·卡斯帕罗夫展开了一场国际象棋比赛。《新闻周刊》将这场比赛称为“大脑的最后一战”。5 月 11 日,比赛以 2.5–2.5 打平,深蓝在最后一局中击败了卡斯帕罗夫。媒体为之疯狂。IBM 的市值一夜之间增加了 180 亿美元。从各方面来看,人工智能都取得了巨大突破。

On May 3, 1997, a chess match began between Deep Blue, a chess computer built by IBM, and Garry Kasparov, the world chess champion and possibly the best human player in history. Newsweek billed the match as “The Brain’s Last Stand.” On May 11, with the match tied at 2½–2½, Deep Blue defeated Kasparov in the final game. The media went berserk. The market capitalization of IBM increased by $18 billion overnight. AI had, by all accounts, achieved a massive breakthrough.

从人工智能研究的角度来看,这场比赛根本不代表什么突破。深蓝的胜利虽然令人印象深刻,但只是延续了几十年来一直可见的趋势。国际象棋算法的基本设计由克劳德·香农于 1950 年提出,1并在 20 世纪 60 年代初得到了重大改进。此后,最佳程序的国际象棋等级分稳步提高,这主要归功于更快的计算机使程序能够预测得更远。1994 年,2 Peter Norvig 和我绘制了自 1965 年以来最佳国际象棋程序的数值等级分图,在该等级标准下卡斯帕罗夫的评分为 2805。等级分从 1965 年的 1400 起步,30 年来几乎呈完美的直线上升。从 1994 年沿这条直线向前外推,可以预测计算机将在 1997 年击败卡斯帕罗夫——而这正是实际发生的时间。

From the point of view of AI research, the match represented no breakthrough at all. Deep Blue’s victory, impressive as it was, merely continued a trend that had been visible for decades. The basic design for chess-playing algorithms was laid out in 1950 by Claude Shannon,1 with major improvements in the early 1960s. After that, the chess ratings of the best programs improved steadily, mainly as a result of faster computers that allowed programs to look further ahead. In 1994,2 Peter Norvig and I charted the numerical ratings of the best chess programs from 1965 onwards, on a scale where Kasparov’s rating was 2805. The ratings started at 1400 in 1965 and improved in an almost perfect straight line for thirty years. Extrapolating the line forward from 1994 predicts that computers would be able to defeat Kasparov in 1997—exactly when it happened.
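
正文中的直线外推可以用几行算术复现(只使用文中给出的数字:1965 年 1400 分、卡斯帕罗夫 2805 分、1997 年交叉):

The straight-line extrapolation in the text can be reproduced with a few lines of arithmetic (using only the numbers given in the text: 1400 in 1965, Kasparov at 2805, and the crossing in 1997):

```python
# The straight-line trend described in the text: ratings start at 1400 in
# 1965 and rise linearly. If the line reaches Kasparov's 2805 in 1997, the
# implied slope is about 44 rating points per year.
start_year, start_rating = 1965, 1400
kasparov_rating, crossing_year = 2805, 1997

slope = (kasparov_rating - start_rating) / (crossing_year - start_year)
print(f"implied improvement: {slope:.1f} points/year")

# Extrapolating from 1994, as Norvig and Russell could, gives the same answer:
rating_in_1994 = start_rating + slope * (1994 - start_year)
years_left = (kasparov_rating - rating_in_1994) / slope
print(f"1994 rating on trend: {rating_in_1994:.0f}; "
      f"crosses 2805 in {1994 + years_left:.0f}")
```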

因此,对于人工智能研究人员来说,真正的突破发生在"深蓝"闯入公众视野的三四十年前。同样,深度卷积网络早在成为头条新闻的二十多年前就已存在,而且所有数学原理都已完全解决。

For AI researchers, then, the real breakthroughs happened thirty or forty years before Deep Blue burst into the public’s consciousness. Similarly, deep convolutional networks existed, with all the mathematics fully worked out, more than twenty years before they began to create headlines.

公众从媒体上看到的人工智能突破——对人类的惊人胜利、机器人成为沙特阿拉伯公民等等——与全球研究实验室中真正发生的事情几乎没有关系。在实验室内部,研究涉及大量的思考、讨论和在白板上书写数学公式。想法不断被产生、抛弃和重新发现。一个好想法——一个真正的突破——往往在当时不会被注意到,可能直到后来——也许是在更合适的时机被人重新发明之后——才被理解为为人工智能的一项重大进步奠定了基础。想法需要先在简单的问题上试验,以证明基本直觉是正确的,然后再在更困难的问题上试验,看看它们的扩展性如何。通常,一个想法本身无法带来能力上的实质性改进,它必须等待另一个想法的出现,以便两者的结合能够展示价值。

The view of AI breakthroughs that the public gets from the media—stunning victories over humans, robots becoming citizens of Saudi Arabia, and so on—bears very little relation to what really happens in the world’s research labs. Inside the lab, research involves a lot of thinking and talking and writing mathematical formulas on whiteboards. Ideas are constantly being generated, abandoned, and rediscovered. A good idea—a real breakthrough—will often go unnoticed at the time and may only later be understood as having provided the basis for a substantial advance in AI, perhaps when someone reinvents it at a more convenient time. Ideas are tried out, initially on simple problems to show that the basic intuitions are correct and then on harder problems to see how well they scale up. Often, an idea will fail by itself to provide a substantial improvement in capabilities, and it has to wait for another idea to come along so that the combination of the two can demonstrate value.

所有这些活动从外部来看是完全不可见的。在实验室之外的世界,只有当想法的逐渐积累及其有效性的证据跨过一道门槛时,人工智能才会变得可见:即到了值得投入资金和工程努力来创造新的商业产品或令人印象深刻的演示的时间点。然后媒体就会宣布取得了突破。

All this activity is completely invisible from the outside. In the world beyond the lab, AI becomes visible only when the gradual accumulation of ideas and the evidence for their validity crosses a threshold: the point where it becomes worthwhile to invest money and engineering effort to create a new commercial product or an impressive demonstration. Then the media announce that a breakthrough has occurred.

因此可以预见,许多其他一直在全球研究实验室中酝酿的想法,将在未来几年内跨过商业应用的门槛。随着商业投资率的提高以及世界对人工智能应用的接受度越来越高,这种情况会越来越频繁地发生。本章提供了一些我们可以看到的即将问世之物的样本。

One can expect, then, that many other ideas that have been gestating in the world’s research labs will cross the threshold of commercial applicability over the next few years. This will happen more and more frequently as the rate of commercial investment increases and as the world becomes more and more receptive to applications of AI. This chapter provides a sampling of what we can see coming down the pipe.

在此过程中,我会提到这些技术进步的一些缺点。您可能还能想到更多,但别担心。我将在下一章中讨论这些。

Along the way, I’ll mention some of the drawbacks of these technological advances. You will probably be able to think of many more, but don’t worry. I’ll get to those in the next chapter.

人工智能生态系统

The AI ecosystem

最初,大多数计算机运行的环境基本上是无形的、虚空的:它们唯一的输入来自打孔卡,唯一的输出方式是在行式打印机上打印字符。也许正是出于这个原因,大多数研究人员将智能机器视为问答器;直到 20 世纪 80 年代,将机器视为在环境中感知和行动的代理的观点才得到广泛传播。

In the beginning, the environment in which most computers operated was essentially formless and void: their only input came from punched cards and their only method of output was to print characters on a line printer. Perhaps for this reason, most researchers viewed intelligent machines as question-answerers; the view of machines as agents perceiving and acting in an environment did not become widespread until the 1980s.

20 世纪 90 年代万维网的出现为智能机器开辟了一个全新的活动天地。一个新词"软机器人"(softbot)被创造出来,用来描述完全在网络等软件环境中运行的软件"机器人"。软机器人(后来简称为 bot)可以感知网页,并通过发出字符序列、URL 等来采取行动。

The advent of the World Wide Web in the 1990s opened up a whole new universe for intelligent machines to play in. A new word, softbot, was coined to describe software “robots” that operate entirely in a software environment such as the Web. Softbots, or bots as they later became known, perceive Web pages and act by emitting sequences of characters, URLs, and so on.

在互联网泡沫时期(1997-2000 年),人工智能公司如雨后春笋般涌现,为搜索和电子商务提供核心功能,包括链接分析、推荐系统、声誉系统、比较购物和产品分类。

AI companies mushroomed during the dot-com boom (1997–2000), providing core capabilities for search and e-commerce, including link analysis, recommendation systems, reputation systems, comparison shopping, and product categorization.

21 世纪初,配备麦克风、摄像头、加速度计和 GPS 的手机的广泛普及,为人工智能系统进入人们的日常生活提供了新的途径;Amazon Echo、Google Home 和 Apple HomePod 等"智能音箱"则完成了这一过程。

In the early 2000s, the widespread adoption of mobile phones with microphones, cameras, accelerometers, and GPS provided new access for AI systems to people’s daily lives; “smart speakers” such as the Amazon Echo, Google Home, and Apple HomePod have completed this process.

到 2008 年左右,连接到互联网的物体数量超过了连接到互联网的人数——有些人将这一转变称为物联网 (IoT) 的开始。这些物体包括汽车、家用电器、交通信号灯、自动售货机、恒温器、四轴飞行器、相机、环境传感器、机器人以及制造过程中以及分销和零售系统中的各种物质商品。这为人工智能系统提供了对现实世界的更强的感知和控制能力。

By around 2008, the number of objects connected to the Internet exceeded the number of people connected to the Internet—a transition that some point to as the beginning of the Internet of Things (IoT). Those things include cars, home appliances, traffic lights, vending machines, thermostats, quadcopters, cameras, environmental sensors, robots, and all kinds of material goods both in the manufacturing process and in the distribution and retail system. This provides AI systems with far greater sensory and control access to the real world.

最后,感知能力的提升使得人工智能机器人能够走出工厂(在工厂中它们依赖于严格约束的物体排列),进入真实的、非结构化的、混乱的世界,在这个世界里它们的摄像头可以捕捉到一些有趣的东西。

Finally, improvements in perception have allowed AI-powered robots to move out of the factory, where they relied on rigidly constrained arrangements of objects, and into the real, unstructured, messy world, where their cameras have something interesting to look at.

自动驾驶汽车

Self-driving cars

20 世纪 50 年代末,约翰·麦卡锡 (John McCarthy) 想象有一天自动驾驶汽车可以载他去机场。1987 年,恩斯特·迪克曼斯 (Ernst Dickmanns) 在德国高速公路上展示了一辆自动驾驶的梅赛德斯面包车;它能够保持车道、跟随另一辆车、变道和超车。3三十多年过去了,我们仍然没有完全自动驾驶的汽车,但它已经越来越近了。开发的重点早已从学术研究实验室转移到大公司。截至 2019 年,表现最佳的测试车辆已在公共道路上行驶了数百万英里(在驾驶模拟器中行驶了数十亿英里),没有发生严重事故。4不幸的是,其他自动驾驶和半自动驾驶汽车已经造成数人死亡。5

In the late 1950s, John McCarthy imagined that an automated vehicle might one day take him to the airport. In 1987, Ernst Dickmanns demonstrated a self-driving Mercedes van on the autobahn in Germany; it was capable of staying in lane, following another car, changing lanes, and overtaking.3 More than thirty years later, we still don’t have a fully autonomous car, but it’s getting much closer. The focus of development has long since moved from academic research labs to large corporations. As of 2019, the best-performing test vehicles have logged millions of miles of driving on public roads (and billions of miles in driving simulators) without serious incident.4 Unfortunately, other autonomous and semi-autonomous vehicles have killed several people.5

为何花了这么长时间才实现安全的自动驾驶?第一个原因是性能要求极为苛刻。在美国,人类司机大约每行驶一亿英里才发生一次致命事故,这本身就设定了很高的标准。自动驾驶汽车要想被接受,必须远比这更安全:也许是每十亿英里一次致命事故,相当于以每周四十小时的强度连续驾驶两万五千年。第二个原因是,一种原本被寄予希望的变通方法——当汽车陷入困惑或超出其安全运行条件时把控制权交还给人类——根本行不通。当汽车自动驾驶时,人类很快就会脱离当下的驾驶情境,无法足够快地重新掌握情况以安全接管。此外,如果出了问题,坐在后座的非驾驶员和出租车乘客也没有条件去驾驶汽车。

Why has it taken so long to achieve safe autonomous driving? The first reason is that the performance requirements are exacting. Human drivers in the United States suffer roughly one fatal accident per one hundred million miles traveled, which sets a high bar. Autonomous vehicles, to be accepted, will need to be much better than that: perhaps one fatal accident per billion miles, or twenty-five thousand years of driving forty hours per week. The second reason is that one anticipated workaround—handing control to the human when the vehicle is confused or out of its safe operating conditions—simply doesn’t work. When the car is driving itself, humans quickly become disengaged from the immediate driving circumstances and cannot regain context quickly enough to take over safely. Moreover, nondrivers and taxi passengers who are in the back seat are in no position to drive the car if something goes wrong.

当前项目的目标是实现 SAE 4 级自动驾驶,6这意味着车辆必须在地理范围和天气条件的限制下,始终能够自动驾驶或安全停车。由于天气和交通状况会发生变化,并且可能出现 4 级车辆无法应对的异常情况,因此必须有人坐在车内,随时准备在需要时接管。(5 级——无限制自动驾驶——不需要人类驾驶员,但实现起来更加困难。)4 级自动驾驶远远超出了沿白线行驶和避开障碍物这类简单的反射任务。车辆必须根据当前和过去的观察,评估所有相关物体(包括可能不可见的物体)的意图和可能的未来轨迹。然后,车辆必须使用前瞻搜索,找到一条在安全性和行程进度的某种组合上最优的轨迹。一些项目正在尝试更直接的方法,基于强化学习(当然主要是在模拟中)和对数百名人类驾驶员的记录进行监督学习,但这些方法似乎不太可能达到所需的安全水平。

Current projects are aiming at SAE Level 4 autonomy,6 which means that the vehicle must at all times be capable of driving autonomously or stopping safely, subject to geographical limits and weather conditions. Because weather and traffic conditions can change, and because unusual circumstances can arise that a Level 4 vehicle cannot handle, a human has to be in the vehicle and ready to take over if needed. (Level 5—unrestricted autonomy—does not require a human driver but is even more difficult to achieve.) Level 4 autonomy goes far beyond the simple, reflex tasks of following white lines and avoiding obstacles. The vehicle has to assess the intent and probable future trajectories of all relevant objects, including objects that may not be visible, based on both current and past observations. Then, using lookahead search, the vehicle has to find a trajectory that optimizes some combination of safety and progress. Some projects are trying more direct approaches based on reinforcement learning (mainly in simulation, of course) and supervised learning from recordings of hundreds of human drivers, but these approaches seem unlikely to reach the required level of safety.

完全自动驾驶汽车的潜在好处是巨大的。每年,全球有 120 万人死于车祸,数千万人受重伤。自动驾驶汽车的一个合理目标是将这些数字减少十倍。一些分析还预测,交通成本、停车设施、拥堵和污染将大幅减少。城市将从私家车和大型公交车转向无处不在的共享乘坐的自动驾驶电动汽车,提供门到门服务,并与枢纽之间的高速公共交通相衔接。7由于成本低至每乘客英里 3 美分,大多数城市可能会选择免费提供这项服务——同时让乘客接受无休止的广告轰炸。

The potential benefits of fully autonomous vehicles are immense. Every year, 1.2 million people die in car accidents worldwide and tens of millions suffer serious injuries. A reasonable target for autonomous vehicles would be to reduce these numbers by a factor of ten. Some analyses also predict a vast reduction in transportation costs, parking structures, congestion, and pollution. Cities will shift from personal cars and large buses to ubiquitous shared-ride, autonomous electric vehicles, providing door-to-door service and feeding high-speed mass-transit connections between hubs.7 With costs as low as three cents per passenger mile, most cities would probably opt to provide the service for free—while subjecting riders to interminable barrages of advertising.

当然,要获得所有这些好处,该行业必须注意风险。如果太多死亡事件归咎于设计不良的实验车辆,监管机构可能会停止计划中的部署,或实施几十年内都无法达到的极其严格的标准。8当然,人们可能会决定不购买或乘坐自动驾驶汽车,除非它们确实安全。2018 年的一项民意调查显示,与 2016 年相比,消费者对自动驾驶汽车技术的信任度大幅下降。9即使该技术取得成功,向广泛自动驾驶的过渡也将是一个尴尬的过程:人类的驾驶技能可能会退化或消失,而自己开车这种鲁莽和反社会的行为可能会被彻底禁止。

Of course, to reap all these benefits, the industry has to pay attention to the risks. If there are too many deaths attributed to poorly designed experimental vehicles, regulators may halt planned deployments or impose extremely stringent standards that might be unreachable for decades.8 And people might, of course, decide not to buy or ride in autonomous vehicles unless they are demonstrably safe. A 2018 poll revealed a significant decline in consumers’ level of trust in autonomous vehicle technology compared to 2016.9 Even if the technology is successful, the transition to widespread autonomy will be an awkward one: human driving skills may atrophy or disappear, and the reckless and antisocial act of driving a car oneself may be banned altogether.

智能个人助理

Intelligent personal assistants

现在,大多数读者都体验过非智能的个人助理:服从电视上偶然听到的购买指令的智能音箱,或者这样的手机聊天机器人——当你说"给我叫救护车!"(Call me an ambulance!)时,它回答"好的,从现在开始我就叫你'安·救护车'(Ann Ambulance)"。这类系统本质上是应用程序和搜索引擎的语音介导界面;它们主要基于预设的刺激-反应模板,这种方法可以追溯到 20 世纪 60 年代中期的 Eliza 系统。10

Most readers will by now have experienced the unintelligent personal assistant: the smart speaker that obeys purchase commands overheard on the television, or the cell phone chatbot that responds to “Call me an ambulance!” with “OK, from now on I’ll call you ‘Ann Ambulance.’” Such systems are essentially voice-mediated interfaces to applications and search engines; they are based largely on canned stimulus–response templates, an approach that dates back to the Eliza system in the mid-1960s.10

这些早期系统有三种缺点:访问、内容和情境。访问缺陷意味着他们缺乏感知正在发生的事情的能力——例如,他们可能能够听到用户在说什么,但却看不到用户在和谁说话。内容缺陷意味着他们根本无法理解用户所说或发的短信的含义,即使他们能够访问这些信息。情境缺陷意味着他们缺乏跟踪和推理构成日常生活的目标、活动和关系的能力。

These early systems have shortcomings of three kinds: access, content, and context. Access shortcomings mean that they lack sensory awareness of what’s going on—for example, they might be able to hear what the user is saying but they can’t see who the user is talking to. Content shortcomings mean that they simply fail to understand the meaning of what the user is saying or texting, even if they have access to it. Context shortcomings mean that they lack the ability to keep track of and reason about the goals, activities, and relationships that constitute daily life.

Despite these shortcomings, smart speakers and cell phone assistants offer just enough value to the user to have entered the homes and pockets of hundreds of millions of people. They are, in a sense, Trojan horses for AI. Because they are there, embedded in so many lives, every tiny improvement in their capabilities is worth billions of dollars.

And so, improvements are coming thick and fast. Probably the most important is the elementary capacity to understand content—to know that “John’s in the hospital” is not just a prompt to say “I hope it’s nothing serious” but contains actual information that the user’s eight-year-old son is in a nearby hospital and may have a serious injury or illness. The ability to access email and text communications as well as phone calls and domestic conversations (through the smart speaker in the house) would give AI systems enough information to build a reasonably complete picture of the user’s life—perhaps even more information than might have been available to the butler working for a nineteenth-century aristocratic family or the executive assistant working for a modern-day CEO.

Raw information, of course, is not enough. To be really useful, an assistant also needs commonsense knowledge of how the world works: that a child in the hospital is not simultaneously at home; that hospital care for a broken arm seldom lasts for more than a day or two; that the child’s school will need to know of the expected absence; and so on. Such knowledge allows the assistant to keep track of things it does not observe directly—an essential skill for intelligent systems.

The capabilities described in the preceding paragraph are, I believe, feasible with existing technology for probabilistic reasoning,C but this would require a very substantial effort to construct models of all the kinds of events and transactions that make up our daily lives. Up to now, these kinds of commonsense modeling projects have generally not been undertaken (except possibly in classified systems for intelligence analysis and military planning) because of the costs involved and the uncertain payoff. Now, however, projects like this could easily reach hundreds of millions of users, so the investment risks are lower and the potential rewards are much higher. Furthermore, access to large numbers of users allows the intelligent assistant to learn very quickly and fill in all the gaps in its knowledge.
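
To make this concrete, here is a minimal sketch of the kind of probabilistic state tracking just described: a discrete Bayes filter maintaining a belief about a fact the assistant never observes directly. The states, prior, and observation probabilities are all invented for illustration:

```python
# Minimal discrete Bayes filter (illustrative numbers only): maintain a
# belief over a fact the assistant cannot observe directly, and update it
# as new utterances arrive.

def update(belief, likelihoods):
    """Multiply the prior by P(utterance | state), then renormalize."""
    posterior = {s: belief[s] * likelihoods[s] for s in belief}
    total = sum(posterior.values())
    return {s: p / total for s, p in posterior.items()}

# Prior: on an ordinary day, the child is almost certainly not in hospital.
belief = {"in_hospital": 0.001, "at_home": 0.999}

# Invented observation model: P(hearing "John's in the hospital" | state).
utterance_model = {"in_hospital": 0.9, "at_home": 0.00001}

belief = update(belief, utterance_model)
print(belief)  # P(in_hospital) jumps to roughly 0.99
```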

Thus, one can expect to see intelligent assistants that will, for pennies a month, help users with managing an increasingly large range of daily activities: calendars, travel, household purchases, bill payment, children’s homework, email and call screening, reminders, meal planning, and—one can but dream—finding my keys. These skills will not be scattered across multiple apps. Instead, they will be facets of a single, integrated agent that can take advantage of the synergies available in what military people call the common operational picture.

The general design template for an intelligent assistant involves background knowledge about human activities, the ability to extract information from streams of perceptual and textual data, and a learning process to adapt the assistant to the user’s particular circumstances. The same general template can be applied to at least three other major areas: health, education, and finances. For these applications, the system needs to keep track of the state of the user’s body, mind, and bank account (broadly construed). As with assistants for daily life, the up-front cost of creating the necessary general knowledge in each of these three areas amortizes across billions of users.

In the case of health, for example, we all have roughly the same physiology, and detailed knowledge of how it works has already been encoded in machine-readable form.11 Systems will adapt to your individual characteristics and lifestyle, providing preventive suggestions and early warning of problems.

In the area of education, the promise of intelligent tutoring systems was recognized even in the 1960s,12 but real progress has been a long time coming. The primary reasons are shortcomings of content and access: most tutoring systems don’t understand the content of what they purport to teach, nor can they engage in two-way communication with their pupils through speech or text. (I imagine myself teaching string theory, which I don’t understand, in Laotian, which I don’t speak.) Recent progress in speech recognition means that automated tutors can, at last, communicate with pupils who are not yet fully literate. Moreover, probabilistic reasoning technology can now keep track of what students know and don’t know13 and can optimize the delivery of instruction to maximize learning. The Global Learning XPRIZE competition, which started in 2014, offered $15 million for “open-source, scalable software that will enable children in developing countries to teach themselves basic reading, writing and arithmetic within 15 months.” Results from the winners, Kitkit School and onebillion, suggest that the goal has largely been achieved.
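
A standard instance of such tracking is Bayesian knowledge tracing. The sketch below is my illustration of the idea with invented parameter values, not anything taken from the XPRIZE-winning systems:

```python
# Bayesian knowledge tracing: maintain P(pupil has mastered a skill) from a
# stream of graded answers. The parameter values are invented illustrations.
P_GUESS = 0.2   # P(correct answer | skill not mastered)
P_SLIP = 0.1    # P(wrong answer  | skill mastered)
P_LEARN = 0.15  # P(mastering the skill during one practice item)

def trace(p_known, correct):
    """Bayesian update on the answer, then a learning-transition step."""
    if correct:
        evidence = p_known * (1 - P_SLIP) + (1 - p_known) * P_GUESS
        posterior = p_known * (1 - P_SLIP) / evidence
    else:
        evidence = p_known * P_SLIP + (1 - p_known) * (1 - P_GUESS)
        posterior = p_known * P_SLIP / evidence
    return posterior + (1 - posterior) * P_LEARN

p = 0.1  # prior: the pupil probably does not know the skill yet
for answer in [True, True, False, True]:
    p = trace(p, answer)
    print(f"P(mastered) = {p:.2f}")  # rises on correct answers, dips on errors
```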

In the area of personal finance, systems will keep track of investments, income streams, obligatory and discretionary expenditures, debt, interest payments, emergency reserves, and so on, in much the same way that financial analysts keep track of the finances and prospects of corporations. Integration with the agent that handles daily life will provide an even finer-grained understanding, perhaps even ensuring that the children get their pocket money minus any mischief-related deductions. One can expect to receive the quality of day-to-day financial advice previously reserved for the ultra-rich.

If your privacy alarm bells weren’t ringing as you read the preceding paragraphs, you haven’t been keeping up with the news. There are, however, multiple layers to the privacy story. First, can a personal assistant really be useful if it knows nothing about you? Probably not. Second, can personal assistants be really useful if they cannot pool information from multiple users to learn more about people in general and people who are similar to you? Probably not. So, don’t those two things imply that we have to give up our privacy to benefit from AI in our daily lives? No. The reason is that learning algorithms can operate on encrypted data using the techniques of secure multiparty computation, so that users can benefit from pooling without compromising privacy in any way.14 Will software providers adopt privacy-preserving technology voluntarily, without legislative encouragement? That remains to be seen. What seems inevitable, however, is that users will trust a personal assistant only if its primary obligation is to the user rather than to the corporation that produced it.
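
Secure multiparty computation comes in many forms, but one of its simplest building blocks, additive secret sharing, conveys the flavor. In the sketch below (illustrative only, not a production protocol), each user splits a private number into random shares held by independent servers; no single server learns anything about any individual, yet the servers can jointly compute the aggregate that a pooled-learning algorithm needs:

```python
import random

# Additive secret sharing, one building block of secure multiparty
# computation (an illustrative sketch, not a production protocol).
PRIME = 2**61 - 1  # all arithmetic is modulo a large prime

def share(secret, n_servers=3):
    """Split a private value into n random shares summing to it mod PRIME."""
    shares = [random.randrange(PRIME) for _ in range(n_servers - 1)]
    shares.append((secret - sum(shares)) % PRIME)
    return shares

# Three users with private values (say, minutes of exercise this week).
private_values = [35, 120, 60]
all_shares = [share(v) for v in private_values]

# Server i sums the i-th shares; no server ever sees a user's actual value.
server_totals = [sum(user_shares[i] for user_shares in all_shares) % PRIME
                 for i in range(3)]

# Combining only the servers' totals reveals just the aggregate statistic.
print(sum(server_totals) % PRIME)  # 215 = 35 + 120 + 60
```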

Smart homes and domestic robots

The smart home concept has been investigated for several decades. In 1966, James Sutherland, an engineer at Westinghouse, started collecting surplus computer parts to build ECHO, the first smart-home controller.15 Unfortunately, ECHO weighed eight hundred pounds, consumed 3.5 kilowatts, and managed just three digital clocks and the TV antenna. Subsequent systems required users to master control interfaces of mind-boggling complexity. Unsurprisingly, they never caught on.

Beginning in the 1990s, several ambitious projects attempted to design houses that managed themselves with minimal human intervention, using machine learning to adapt to the lifestyles of the occupants. To make these experiments meaningful, real people had to live in the houses. Unfortunately, the frequency of erroneous decisions made the systems worse than useless—the occupants’ quality of life decreased rather than increased. For example, inhabitants of the 2003 MavHome project16 at Washington State University often had to sit in the dark if their visitors stayed later than the usual bedtime.17 As with the unintelligent personal assistant, such failings result from inadequate sensory access to the activities of the occupants and the inability to understand and keep track of what’s happening in the house.

A truly smart home equipped with cameras and microphones—and the requisite perceptual and reasoning abilities—can understand what the occupants are doing: visiting, eating, sleeping, watching TV, reading, exercising, getting ready for a long trip, or lying helpless on the floor after a fall. By coordinating with the intelligent personal assistant, the home can have a pretty good idea of who will be in or out of the house at what time, who’s eating where, and so on. This understanding allows it to manage heating, lighting, window blinds, and security systems, to send timely reminders, and to alert users or emergency services when a problem arises. Some newly built apartment complexes in the United States and Japan are already incorporating technology of this kind.18

The value of the smart home is limited because of its actuators: much simpler systems (timed thermostats and motion-sensitive lights and burglar alarms) can deliver a lot of the same functionality in ways that are perhaps more predictable, if less context sensitive. The smart home cannot fold the laundry, clear the dishes, or pick up the newspaper. It really wants a physical robot to do its bidding.

FIGURE 5: (left) BRETT folding towels; (right) the Boston Dynamics SpotMini robot opening a door.

It may not have too long to wait. Already, robots have demonstrated many of the required skills. In the Berkeley lab of my colleague Pieter Abbeel, BRETT (the Berkeley Robot for the Elimination of Tedious Tasks) has been folding piles of towels since 2011, while the SpotMini robot from Boston Dynamics can climb stairs and open doors (figure 5). Several companies are already building cooking robots, although they require special, enclosed setups and pre-cut ingredients and won’t work in an ordinary kitchen.19

Of the three basic physical capabilities required for a useful domestic robot—perception, mobility, and dexterity—the latter is most problematic. As Stefanie Tellex, a robotics professor at Brown University, puts it, “Most robots can’t pick up most objects most of the time.” This is partly a problem of tactile sensing, partly a manufacturing problem (dexterous hands are currently very expensive to build), and partly an algorithmic problem: we don’t yet have a good understanding of how to combine sensing and control to grasp and manipulate the huge variety of objects in a typical household. There are dozens of grasp types just for rigid objects and there are thousands of distinct manipulation skills, such as shaking exactly two pills out of a bottle, peeling the label off a jam jar, spreading hard butter on soft bread, or lifting one strand of spaghetti from the pot with a fork to see if it’s ready.

It seems likely that the tactile sensing and hand construction problems will be solved by 3D printing, which is already being used by Boston Dynamics for some of the more complex parts of their Atlas humanoid robot. Robot manipulation skills are advancing rapidly, thanks in part to deep reinforcement learning.20 The final push—putting all this together into something that begins to approximate the awesome physical skills of movie robots—is likely to come from the rather unromantic warehouse industry. Just one company, Amazon, employs several hundred thousand people who pick products out of bins in giant warehouses and dispatch them to customers. From 2015 through 2017 Amazon ran an annual “Picking Challenge” to accelerate the development of robots capable of doing this task.21 There is still some distance to go, but when the core research problems are solved—probably within a decade—one can expect a very rapid rollout of highly capable robots. Initially they will work in warehouses, then in other commercial applications such as agriculture and construction, where the range of tasks and objects is fairly predictable. We might also see them quite soon in the retail sector doing tasks such as stocking supermarket shelves and refolding clothes.

The first to really benefit from robots in the home will be the elderly and infirm, for whom a helpful robot can provide a degree of independence that would otherwise be impossible. Even if the robot has a limited repertoire of tasks and only rudimentary comprehension of what’s going on, it can still be very useful. On the other hand, the robot butler, managing the household with aplomb and anticipating its master’s every wish, is still some way off—it requires something approaching the generality of human-level AI.

Intelligence on a global scale

The development of basic capabilities for understanding speech and text will allow intelligent personal assistants to do things that human assistants can already do (but they will be doing it for pennies per month instead of thousands of dollars per month). Basic speech and text understanding also enable machines to do things that no human can do—not because of the depth of understanding but because of its scale. For example, a machine with basic reading capabilities will be able to read everything the human race has ever written by lunchtime, and then it will be looking around for something else to do.22 With speech recognition capabilities, it could listen to every radio and television broadcast before teatime. For comparison, it would take two hundred thousand full-time humans just to keep up with the world’s current level of print publication (let alone all the written material from the past) and another sixty thousand to listen to current broadcasts.23
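
The arithmetic behind such estimates is easy to reproduce. The sketch below uses my own illustrative assumptions, not the estimates in the book's footnotes:

```python
# Back-of-the-envelope scale comparison. Every number is an assumption made
# for illustration; the book's footnotes use their own estimates.
human_wpm = 250                  # adult reading speed, words per minute
machine_wps = 1_000_000          # hypothetical machine reading, words/second

books_ever = 130e6               # rough count of distinct published books
words_per_book = 80_000
total_words = books_ever * words_per_book          # about 1e13 words

machine_hours = total_words / machine_wps / 3600
human_years = total_words / human_wpm / 60 / 2000  # 2,000 work hours/year

print(f"machine: {machine_hours:,.0f} hours")      # ~3 hours: by lunchtime
print(f"one human: {human_years:,.0f} working years")
```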

Such a system, if it could extract even simple factual assertions and integrate all this information across all languages, would represent an incredible resource for answering questions and revealing patterns—probably far more powerful than search engines, which are currently valued at around $1 trillion. Its research value for fields such as history and sociology would be inestimable.

Of course, it would also be possible to listen to all the world’s phone calls (a job that would require about twenty million people). There are certain clandestine agencies that would find this valuable. Some of them have been doing simple kinds of large-scale machine listening, such as spotting key words in conversations, for many years, and have now made the transition to transcribing entire conversations into searchable text.24 Transcriptions are certainly useful, but not nearly as useful as simultaneous understanding and content integration of all conversations.

Another “superpower” that is available to machines is to see the entire world at once. Roughly speaking, satellites image the entire world every day at an average resolution of around fifty centimeters per pixel. At this resolution, every house, ship, car, cow, and tree on Earth is visible. Well over thirty million full-time employees would be needed to examine all these images;25 so, at present, no human ever sees the vast majority of satellite data. Computer vision algorithms could process all this data to produce a searchable database of the whole world, updated daily, as well as visualizations and predictive models of economic activities, changes in vegetation, migrations of animals and people, the effects of climate change, and so on. Satellite companies such as Planet and DigitalGlobe are busy making this idea a reality.
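
A similar back-of-the-envelope check recovers the order of magnitude; again, the inputs are my assumptions rather than the book's footnoted calculation:

```python
# Rough reconstruction of the satellite workload. All inputs are assumptions
# for illustration, not the book's footnoted calculation.
land_area_m2 = 1.49e14           # Earth's land area in square meters
pixel_area_m2 = 0.5 * 0.5        # 50 cm resolution -> 0.25 m^2 per pixel
pixels_per_day = land_area_m2 / pixel_area_m2   # ~6e14 new pixels daily

# Suppose a careful analyst can scrutinize about 20 million pixels per
# working day (searching images for small objects is slow).
pixels_per_analyst_day = 2e7

print(f"{pixels_per_day / pixels_per_analyst_day:,.0f} analysts needed")
# ~29,800,000: the tens of millions quoted above
```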

With the possibility of sensing on a global scale comes the possibility of decision making on a global scale. For example, from global satellite data feeds, it should be possible to create detailed models for managing the global environment, predicting the effects of environmental and economic interventions, and providing the necessary analytical inputs to the UN’s sustainable development goals.26 We are already seeing “smart city” control systems that aim to optimize traffic management, transit, trash collection, road repairs, environmental maintenance, and other functions for the benefit of citizens, and these may be extended to the country level. Until recently, this degree of coordination could be achieved only by huge, inefficient, bureaucratic hierarchies of humans; inevitably, these will be replaced by mega-agents that take care of more and more aspects of our collective lives. Along with this, of course, comes the possibility of privacy invasion and social control on a global scale, to which I return in the next chapter.

When Will Superintelligent AI Arrive?

I am often asked to predict when superintelligent AI will arrive, and I usually refuse to answer. There are three reasons for this. First, there is a long history of such predictions going wrong.27 For example, in 1960, the AI pioneer and Nobel Prize–winning economist Herbert Simon wrote, “Technologically . . . machines will be capable, within twenty years, of doing any work a man can do.”28 In 1967, Marvin Minsky, a co-organizer of the 1956 Dartmouth workshop that started the field of AI, wrote, “Within a generation, I am convinced, few compartments of intellect will remain outside the machine’s realm—the problem of creating ‘artificial intelligence’ will be substantially solved.”29

A second reason for declining to provide a date for superintelligent AI is that there is no clear threshold that will be crossed. Machines already exceed human capabilities in some areas. Those areas will broaden and deepen, and it is likely that there will be superhuman general knowledge systems, superhuman biomedical research systems, superhuman dexterous and agile robots, superhuman corporate planning systems, and so on well before we have a completely general superintelligent AI system. These “partially superintelligent” systems will, individually and collectively, begin to pose many of the same issues that a generally intelligent system would.

A third reason for not predicting the arrival of superintelligent AI is that it is inherently unpredictable. It requires “conceptual breakthroughs,” as noted by John McCarthy in a 1977 interview.30 McCarthy went on to say, “What you want is 1.7 Einsteins and 0.3 of the Manhattan Project, and you want the Einsteins first. I believe it’ll take five to 500 years.” In the next section I’ll explain what some of the conceptual breakthroughs are likely to be. Just how unpredictable are they? Probably as unpredictable as Szilard’s invention of the nuclear chain reaction a few hours after Rutherford’s declaration that it was completely impossible.

Once, at a meeting of the World Economic Forum in 2015, I answered the question of when we might see superintelligent AI. The meeting was under Chatham House rules, which means that no remarks may be attributed to anyone present at the meeting. Even so, out of an excess of caution, I prefaced my answer with “Strictly off the record. . . .” I suggested that, barring intervening catastrophes, it would probably happen in the lifetime of my children—who were still quite young and would probably have much longer lives, thanks to advances in medical science, than many of those at the meeting. Less than two hours later, an article appeared in the Daily Telegraph citing Professor Russell’s remarks, complete with images of rampaging Terminator robots. The headline was ‘SOCIOPATHIC’ ROBOTS COULD OVERRUN THE HUMAN RACE WITHIN A GENERATION.

My timeline of, say, eighty years is considerably more conservative than that of the typical AI researcher. Recent surveys31 suggest that most active researchers expect human-level AI to arrive around the middle of this century. Our experience with nuclear physics suggests that it would be prudent to assume that progress could occur quite quickly and to prepare accordingly. If just one conceptual breakthrough were needed, analogous to Szilard’s idea for a neutron-induced nuclear chain reaction, superintelligent AI in some form could arrive quite suddenly. The chances are that we would be unprepared: if we built superintelligent machines with any degree of autonomy, we would soon find ourselves unable to control them. I am, however, fairly confident that we have some breathing space because there are several major breakthroughs needed between here and superintelligence, not just one.

Conceptual Breakthroughs to Come

The problem of creating general-purpose, human-level AI is far from solved. Solving it is not a matter of spending money on more engineers, more data, and bigger computers. Some futurists produce charts that extrapolate the exponential growth of computing power into the future based on Moore’s law, showing the dates when machines will become more powerful than insect brains, mouse brains, human brains, all human brains put together, and so on.32 These charts are meaningless because, as I have already said, faster machines just give you the wrong answer more quickly. If one were to collect AI’s leading experts into a single team with unlimited resources, with the goal of creating an integrated, human-level intelligent system by combining all our best ideas, the result would be failure. The system would break in the real world. It wouldn’t understand what was going on; it wouldn’t be able to predict the consequences of its actions; it wouldn’t understand what people want in any given situation; and so it would do ridiculously stupid things.

By understanding how the system would break, AI researchers are able to identify the problems that have to be solved—the conceptual breakthroughs that are needed—in order to reach human-level AI. I will now describe some of these remaining problems. Once they are solved, there may be more, but not very many more.

Language and common sense

Intelligence without knowledge is like an engine without fuel. Humans acquire a vast amount of knowledge from other humans: it is passed down through generations in the form of language. Some of it is factual: Obama became president in 2009, the density of copper is 8.92 grams per cubic centimeter, the code of Ur-Nammu set out punishments for various crimes, and so on. A great deal of knowledge resides in the language itself—in the concepts that it makes available. President, 2009, density, copper, gram, centimeter, crime, and the rest all carry with them a vast amount of information, which represents the extracted essence of the processes of discovery and organization that led them to be in the language in the first place.

Take, for example, copper, which refers to some collection of atoms in the universe, and compare it to arglebarglium, which is my name for an equally large collection of entirely randomly selected atoms in the universe. There are many general, useful, and predictive laws one can discover about copper—about its density, conductivity, malleability, melting point, stellar origin, chemical compounds, practical uses, and so on; in comparison, there is essentially nothing that can be said about arglebarglium. An organism equipped with a language composed of words like arglebarglium would be unable to function, because it would never discover the regularities that would allow it to model and predict its universe.

A machine that really understands human language would be in a position to quickly acquire vast quantities of human knowledge, allowing it to bypass tens of thousands of years of learning by the more than one hundred billion people who have lived on Earth. It seems simply impractical to expect a machine to rediscover all this from scratch, starting from raw sensory data.

At present, however, natural language technology is not up to the task of reading and understanding millions of books—many of which would stump even a well-educated human. Systems such as IBM’s Watson, which famously defeated two human champions of the Jeopardy! quiz game in 2011, can extract simple information from clearly stated facts but cannot build complex knowledge structures from text; nor can they answer questions that require extensive chains of reasoning with information from multiple sources. For example, the task of reading all available documents up to the end of 1973 and assessing (with explanations) the probable outcome of the Watergate impeachment process against then president Nixon would be well beyond the current state of the art.

There are serious efforts underway to deepen the level of language analysis and information extraction. For example, Project Aristo at the Allen Institute for AI aims to build systems that can pass school science exams after reading textbooks and study guides.33 Here’s a question from a fourth-grade test:34

Fourth graders are planning a roller-skate race. Which surface would be the best for this race?

(A) gravel
(B) sand
(C) blacktop
(D) grass

A machine faces at least two sources of difficulty in answering this question. The first is the classical language-understanding problem of working out what the sentences say: analyzing the syntactic structure, identifying the meanings of words, and so on. (Try this for yourself: use an online translation service to translate the sentences into an unfamiliar language, then use a dictionary for that language to try translating them back to English.) The second is the need for commonsense knowledge: to work out that a “roller-skate race” is probably a race between people wearing roller skates (on their feet) rather than a race between roller skates, to understand that the “surface” is what the skaters will skate on rather than what the spectators will sit on, to know what “best” means in the context of a surface for a race, and so on. Think how the answer might change if we replaced “fourth graders” with “sadistic army boot-camp trainers.”

One way to summarize the difficulty is to say that reading requires knowledge and knowledge (largely) comes from reading. In other words, we face a classic chicken-and-egg situation. We might hope for a bootstrapping process, whereby the system reads some easy text, acquires some knowledge, uses that to read more difficult text, acquires still more knowledge, and so on. Unfortunately, what tends to happen is the opposite: the knowledge acquired is mostly erroneous, which causes errors in reading, which results in more erroneous knowledge, and so on.

For example, the NELL (Never-Ending Language Learning) project at Carnegie Mellon University is probably the most ambitious language-bootstrapping project currently underway. From 2010 to 2018, NELL acquired over 120 million beliefs by reading English text on the Web.35 Some of these beliefs are accurate, such as the beliefs that the Maple Leafs play hockey and won the Stanley Cup. In addition to facts, NELL acquires new vocabulary, categories, and semantic relationships all the time. Unfortunately, NELL has confidence in only 3 percent of its beliefs and relies on human experts to clean out false or meaningless beliefs on a regular basis—such as its beliefs that “Nepal is a country also known as United States” and “value is an agricultural product that is usually cut into basis.”

I suspect that there may be no single breakthrough that turns the downward spiral into an upward spiral. The basic bootstrapping process seems right: a program that knows enough facts can figure out which fact a novel sentence is referring to, and thereby learns a new textual form for expressing facts—which then lets it discover more facts, and so the process continues. (Sergey Brin, the co-founder of Google, published an important paper on the bootstrapping idea in 1998.36) Priming the pump by supplying a good deal of manually encoded knowledge and linguistic information would certainly help. Increasing the sophistication of the representation of facts—allowing for complex events, causal relationships, beliefs and attitudes of others, and so on—and improving the handling of uncertainty about word meanings and sentence meanings may eventually result in a self-reinforcing rather than self-extinguishing process of learning.
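
Brin's bootstrapping idea fits in a few lines of toy code (my sketch of the general technique, not his actual DIPRE system): start from a seed fact, induce the textual patterns that express it, apply those patterns to harvest new facts, and repeat:

```python
import re

# Toy bootstrapping in the spirit of Brin's 1998 paper (my sketch, not his
# system): alternate between learning textual patterns from known facts and
# using the patterns to extract new facts. Corpus and facts are invented.
corpus = [
    "Paris is the capital of France.",
    "Tokyo is the capital of Japan.",
    "Lima is the capital of Peru.",
    "Ottawa is the capital of Canada.",
]
facts = {("Paris", "France")}  # a single seed fact: (city, country)

for _ in range(2):  # two rounds suffice for this tiny corpus
    # 1. Induce patterns from sentences that mention a known fact.
    patterns = set()
    for city, country in facts:
        for s in corpus:
            if city in s and country in s:
                patterns.add(re.escape(s)
                             .replace(re.escape(city), r"(\w+)", 1)
                             .replace(re.escape(country), r"(\w+)", 1))
    # 2. Apply every pattern to every sentence to harvest new facts.
    for p in patterns:
        for s in corpus:
            m = re.fullmatch(p, s)
            if m:
                facts.add(m.groups())

print(sorted(facts))
# [('Lima', 'Peru'), ('Ottawa', 'Canada'), ('Paris', 'France'), ('Tokyo', 'Japan')]
```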

Cumulative learning of concepts and theories

Approximately 1.4 billion years ago and 8.2 sextillion miles away, two black holes, one twelve million times the mass of the Earth and the other ten million, came close enough to begin orbiting each other. Gradually losing energy, they spiraled closer and closer to each other and faster and faster, reaching an orbital frequency of 250 times per second at a distance of 350 kilometers before finally colliding and merging.37 In the last few milliseconds, the rate of energy emission in the form of gravitational waves was fifty times larger than the total energy output of all the stars in the universe. On September 14, 2015, those gravitational waves arrived at the Earth. They alternately expanded and compressed space itself by a factor of about one in 2.5 sextillion, equivalent to changing the distance to Proxima Centauri (4.4 light years) by the width of a human hair.

Fortunately, two days earlier, the Advanced LIGO (Laser Interferometer Gravitational-Wave Observatory) detectors in Washington and Louisiana had been switched on. Using laser interferometry, they were able to measure the minuscule distortion of space; using calculations based on Einstein’s theory of general relativity, the LIGO researchers had predicted—and were therefore looking for—the exact shape of the gravitational waveform expected from such an event.38

This was possible because of the accumulation and communication of knowledge and concepts by thousands of people across centuries of observation and research. From Thales of Miletus rubbing amber with wool and observing the static charge buildup, through Galileo dropping rocks from the Leaning Tower of Pisa, to Newton seeing an apple fall from a tree, and on through thousands more observations, humanity has gradually accumulated layer upon layer of concepts, theories, and devices: mass, velocity, acceleration, force, Newton’s laws of motion and gravitation, orbital equations, electrical phenomena, atoms, electrons, electric fields, magnetic fields, electromagnetic waves, special relativity, general relativity, quantum mechanics, semiconductors, lasers, computers, and so on.

Now, in principle we can understand this process of discovery as a mapping from all the sensory data ever experienced by all humans to a very complex hypothesis about the sensory data experienced by the LIGO scientists on September 14, 2015, as they watched their computer screens. This is the purely data-driven view of learning: data in, hypothesis out, black box in between. If it could be done, it would be the apotheosis of the “big data, big network” deep learning approach, but it cannot be done. The only plausible idea we have for how intelligent entities could achieve such a stupendous feat as detecting the merger of two black holes is that prior knowledge of physics, combined with the observational data from their instruments, allowed the LIGO scientists to infer the occurrence of the merger event. Moreover, this prior knowledge was itself the result of learning with prior knowledge—and so on, all the way back through history. Thus, we have a roughly cumulative picture of how intelligent entities can build predictive capabilities, with knowledge as the building material.

I say roughly because, of course, science has taken a few wrong turns over the centuries, temporarily pursuing illusory notions such as phlogiston and the luminiferous aether. But we know for a fact that the cumulative picture is what actually happened, in the sense that scientists all along the way wrote down their findings and theories in books and papers. Later scientists had access only to these forms of explicit knowledge, and not to the original sensory experiences of earlier, long-dead generations. Because they are scientists, the members of the LIGO team understood that all the pieces of knowledge they used, including Einstein’s theory of general relativity, are (and always will be) in their probationary period and could be falsified by experiment. As it turned out, the LIGO data provided strong confirmation for general relativity as well as further evidence that the graviton—a hypothesized particle that mediates the force of gravity—is massless.

We are a very long way from being able to create machine learning systems that are capable of matching or exceeding the capacity for cumulative learning and discovery exhibited by the scientific community—or by ordinary human beings in their own lifetimes.39 Deep learning systemsD are mostly data driven: at best, we can “wire in” some very weak forms of prior knowledge in the structure of the network. Probabilistic programming systemsC do allow for prior knowledge in the learning process, as expressed in the structure and vocabulary of the probabilistic knowledge base, but we do not yet have effective methods for generating new concepts and relationships and using them to expand such a knowledge base.

The difficulty is not one of finding hypotheses that provide a good fit to data; deep learning systems can find hypotheses that are a good fit to image data, and AI researchers have built symbolic learning programs able to recapitulate many historical discoveries of quantitative scientific laws.40 Learning in an autonomous intelligent agent requires much more than this.

First, what should be included in the “data” from which predictions are made? For example, in the LIGO experiment, the model for predicting the amount that space stretches and shrinks when a gravitational wave arrives takes into account the masses of the colliding black holes, the frequency of their orbits, and so on, but it doesn’t take into account the day of the week or the occurrence of Major League baseball games. On the other hand, a model for predicting traffic on the San Francisco Bay Bridge takes into account the day of the week and the occurrence of Major League baseball games but ignores the masses and orbital frequencies of colliding black holes. Similarly, programs that learn to recognize the types of objects in images use the pixels as input, whereas a program that learns to estimate the value of an antique object would also want to know what it was made of, who made it and when, its history of usage and ownership, and so on. Why is this? Obviously, it’s because we humans already know something about gravitational waves, traffic, visual images, and antiques. We use this knowledge to decide which inputs are needed for predicting a specific output. This is called feature engineering, and doing it well requires a good understanding of the specific prediction problem.

Of course, a real intelligent machine cannot rely on human feature engineers showing up every time there is something new to learn. It will have to work out for itself what constitutes a reasonable hypothesis space for a learning problem. Presumably, it will do this by bringing to bear a wide range of relevant knowledge in various forms, but at present we have only rudimentary ideas about how to do this.41 Nelson Goodman’s Fact, Fiction, and Forecast42—written in 1954 and perhaps one of the most important and underappreciated books on machine learning—suggests a kind of knowledge called an overhypothesis, because it helps to define what the space of reasonable hypotheses might be. In the case of traffic prediction, for example, the relevant overhypothesis would be that the day of the week, time of day, local events, recent accidents, holidays, transit delays, weather, and sunrise and sunset times can influence traffic conditions. (Notice that you can figure out this overhypothesis from your own background knowledge of the world, without being a traffic expert.) An intelligent learning system can accumulate and use knowledge of this kind to help formulate and solve new learning problems.

Second, and perhaps more important, is the cumulative generation of new concepts such as mass, acceleration, charge, electron, and gravitational force. Without these concepts, scientists (and ordinary people) would have to interpret their universe and make predictions on the basis of raw perceptual inputs. Instead, Newton was able to work with concepts of mass and acceleration developed by Galileo and others; Rutherford could determine that the atom was composed of a dense, positively charged nucleus surrounded by electrons because the concept of an electron had already been developed (by numerous researchers in small steps) in the late nineteenth century; indeed, all scientific discoveries rely on layer upon layer of concepts that stretch back through time and human experience.

In the philosophy of science, particularly in the early twentieth century, it was not uncommon to see the discovery of new concepts attributed to the three ineffable I’s: intuition, insight, and inspiration. All these were considered resistant to any rational or algorithmic explanation. AI researchers, including Herbert Simon,43 have objected strongly to this view. Put simply, if a machine learning algorithm can search in a space of hypotheses that includes the possibility of adding definitions for new terms not present in the input, then the algorithm can discover new concepts.

For example, suppose that a robot is trying to learn the rules of backgammon by watching people playing the game. It observes how they roll the dice and notices that sometimes players move three or four pieces rather than one or two and that this happens after a roll of 1-1, 2-2, 3-3, 4-4, 5-5, or 6-6. If the program can add a new concept of doubles, defined by equality between the two dice, it can express the same predictive theory much more concisely. It is a straightforward process, using methods such as inductive logic programming,44 to create programs that propose new concepts and definitions in order to identify theories that are both accurate and concise.
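
To see how the invented concept buys conciseness, compare two ways of writing the learned rule. This is my illustrative sketch; the predicate names are not from any particular system:

```python
# Without the invented concept, the theory must enumerate every doubles roll:
def extra_moves_v1(d1, d2):
    return (d1, d2) in {(1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6)}

# Inventing a new predicate "doubles" compresses the same theory:
def doubles(d1, d2):          # the newly defined concept
    return d1 == d2

def extra_moves_v2(d1, d2):
    return doubles(d1, d2)

# Both theories make identical predictions on all 36 possible rolls:
assert all(extra_moves_v1(a, b) == extra_moves_v2(a, b)
           for a in range(1, 7) for b in range(1, 7))
```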

At present, we know how to do this for relatively simple cases, but for more complex theories the number of possible new concepts that could be introduced becomes simply enormous. This makes the recent success of deep learning methods in computer vision all the more intriguing. The deep networks usually succeed in finding useful intermediate features such as eyes, legs, stripes, and corners, even though they are using very simple learning algorithms. If we can understand better how this happens, we can apply the same approach to learning new concepts in the more expressive languages needed for science. This by itself would be a huge boon to humanity as well as a significant step towards general-purpose AI.

Discovering actions

Intelligent behavior over long time scales requires the ability to plan and manage activity hierarchically, at multiple levels of abstraction—all the way from doing a PhD (one trillion actions) to a single motor control command sent to one finger as part of typing a single character in the application cover letter.

Our activities are organized into complex hierarchies with dozens of levels of abstraction. These levels and the actions they contain are a key part of our civilization and are handed down through generations via our language and practices. For example, actions such as catching a wild boar and applying for a visa and buying a plane ticket may involve millions of primitive actions, but we can think about them as single units because they are already in the “library” of actions that our language and culture provides and because we know (roughly) how to do them.

Once they are in the library, we can string these high-level actions together into still higher-level actions, such as having a tribal feast for the summer solstice or doing archaeological research for a summer in a remote part of Nepal. Trying to plan such activities from scratch, starting with the lowest-level motor control steps, would be completely hopeless because such activities involve millions or billions of steps, many of which are very unpredictable. (Where will the wild boar be found, and which way will he run?) With suitable high-level actions in the library, on the other hand, one need plan only a dozen or so steps, because each such step is a large piece of the overall activity. This is something that even our feeble human brains can manage—but it gives us the “superpower” of planning over long time scales.
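
A minimal sketch of how such an action library supports hierarchical refinement; the actions and their decompositions are invented for illustration:

```python
# Toy hierarchical planner: each abstract action refines into a short
# subplan of lower-level actions; actions missing from the library are
# treated as primitives. The library contents are invented illustrations.
LIBRARY = {
    "host_solstice_feast": ["invite_tribe", "catch_wild_boar", "cook_boar"],
    "catch_wild_boar": ["track_boar", "set_trap", "wait", "collect_boar"],
    "cook_boar": ["build_fire", "roast", "serve"],
}

def refine(action, depth):
    """Expand an abstract action 'depth' levels; leave the rest abstract."""
    if depth == 0 or action not in LIBRARY:
        return [action]
    return [step for sub in LIBRARY[action] for step in refine(sub, depth - 1)]

# Plan at a high level first (a handful of steps), refining details later:
print(refine("host_solstice_feast", 1))
# ['invite_tribe', 'catch_wild_boar', 'cook_boar']
print(refine("host_solstice_feast", 2))
# ['invite_tribe', 'track_boar', 'set_trap', 'wait', 'collect_boar',
#  'build_fire', 'roast', 'serve']
```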

There was a time when these actions didn’t exist as such—for example, to obtain the right to a plane journey in 1910 would have required a long, involved, and unpredictable process of research, letter writing, and negotiation with various aeronautical pioneers. Other actions recently added to the library include emailing, googling, and ubering. As Alfred North Whitehead wrote in 1911, “Civilization advances by extending the number of important operations which we can perform without thinking about them.”45

FIGURE 6: Saul Steinberg’s View of the World from 9th Avenue, 1976, first published as a cover of The New Yorker magazine.

Saul Steinberg’s famous cover for The New Yorker (figure 6) brilliantly shows, in spatial form, how an intelligent agent manages its own future. The very immediate future is extraordinarily detailed—in fact, my brain has already loaded up the specific motor control sequences for typing the next few words. Looking a bit further ahead, there is less detail—my plan is to finish this section, have lunch, write some more, and watch France play Croatia in the final of the World Cup. Still further ahead, my plans are larger but vaguer: move back from Paris to Berkeley in early August, teach a graduate course, and finish this book. As one moves through time, the future moves closer to the present and the plans for it become more detailed, while new, vague plans may be added to the distant future. Plans for the immediate future become so detailed that they are executable directly by the motor control system.

At present we have only some pieces of this overall picture in place for AI systems. If the hierarchy of abstract actions is provided—including knowledge of how each abstract action can be refined into a subplan composed of more concrete actions—then we have algorithms that can construct complex plans to achieve specific goals. There are algorithms that can execute abstract, hierarchical plans in such a way that the agent always has a primitive, physical action “ready to go,” even if actions in the future are still at an abstract level and not yet executable.

The main missing piece of the puzzle is a method for constructing the hierarchy of abstract actions in the first place. For example, is it possible to start from scratch with a robot that knows only that it can send various electric currents to various motors and have it discover for itself the action of standing up? It’s important to understand that I’m not asking whether we can train a robot to stand up, which can be done simply by applying reinforcement learning with a reward for the robot’s head being farther away from the ground.46 Training a robot to stand up requires that the human trainer already knows what standing up means, so that the right reward signal can be defined. What we want is for the robot to discover for itself that standing up is a thing—a useful abstract action, one that achieves the precondition (being upright) for walking or running or shaking hands or seeing over a wall and so forms part of many abstract plans for all kinds of goals. Similarly, we want the robot to discover actions such as moving from place to place, picking up objects, opening doors, tying knots, cooking dinner, finding my keys, building houses, and many other actions that have no names in any human language because we humans have not discovered them yet.
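
To make the contrast concrete, here is a toy version of the reward-shaping approach just described (every name and number is invented). The point to notice is that the designer, not the robot, decides that head height defines success; the robot learns to stand, but it discovers nothing about standing up being a reusable abstract action:

```python
import random

# Toy reward-shaped RL for "standing up" (all names and numbers invented).
# The state is a single joint angle; the head is highest when the joint is
# straight. Crucially, the designer supplies the reward (head height).

def head_height(angle):
    """Head is highest when the single joint is at 90 degrees."""
    return 1.0 - abs(angle - 90) / 90.0

values = {a: 0.0 for a in range(0, 181, 10)}  # estimated value per posture

for _ in range(2000):
    if random.random() < 0.1:                  # explore a random posture
        angle = random.choice(list(values))
    else:                                      # exploit the best so far
        angle = max(values, key=values.get)
    values[angle] = head_height(angle)         # deterministic reward

print(max(values, key=values.get))  # almost surely 90: upright
```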

I believe this capability is the most important step needed to reach human-level AI. It would, to borrow Whitehead’s phrase again, extend the number of important operations that AI systems can perform without thinking about them. Numerous research groups around the world are hard at work on solving the problem. For example, DeepMind’s 2018 paper showing human-level performance on Quake III Arena Capture the Flag claims that their learning system “constructs a temporally hierarchical representation space in a novel way to promote . . . temporally coherent action sequences.”47 (I’m not completely sure what this means, but it certainly sounds like progress towards the goal of inventing new high-level actions.) I suspect that we do not yet have the complete answer, but this is an advance that could occur any moment, just by putting some existing ideas together in the right way.

拥有这种能力的智能机器将能够比人类更深入地洞察未来。它们还将能够考虑更多的信息。这两种能力结合起来必然会为现实世界带来更好的决策。在人类与机器之间的任何冲突情况下,我们很快就会发现,就像加里·卡斯帕罗夫和李世石一样,我们的每一步都已经被预料到并被阻止。比赛还没开始,我们就会输掉。

Intelligent machines with this capability would be able to look further into the future than humans can. They would also be able to take into account far more information. These two capabilities combined lead inevitably to better real-world decisions. In any kind of conflict situation between humans and machines, we would quickly find, like Garry Kasparov and Lee Sedol, that our every move has been anticipated and blocked. We would lose the game before it even started.

管理心理活动

Managing mental activity

如果管理现实世界中的活动看起来已经很复杂,那就请体谅一下你可怜的大脑吧:它要管理“已知宇宙中最复杂物体”——它自己——的活动。我们并非一开始就知道如何思考,就像我们并非一开始就知道如何走路或弹钢琴,这些都是学出来的。在某种程度上,我们可以选择要有什么样的想法。(来,想一想多汁的汉堡包,或者保加利亚的海关规定——随你挑!)在某些方面,我们的心理活动比现实世界中的活动更复杂,因为我们的大脑比我们的身体有更多的活动部件,而且这些部件移动得更快。计算机也是如此:AlphaGo 在围棋棋盘上每走一步,都要执行数百万甚至数十亿个计算单元,每个计算单元都涉及在前瞻搜索树中添加一个分支,并评估该分支末端的棋盘局面。而每个计算单元之所以发生,是因为程序选择了下一步探索树的哪个部分。大致来说,AlphaGo 会选择它预期能够改善其最终落子决策的计算。

If managing activity in the real world seems complex, spare a thought for your poor brain, managing the activity of the “most complex object in the known universe”—itself. We don’t start out knowing how to think, any more than we start out knowing how to walk or play the piano. We learn how to do it. We can, to some extent, choose what thoughts to have. (Go on, think about a juicy hamburger or Bulgarian customs regulations—your choice!) In some ways, our mental activity is more complex than our activity in the real world, because our brains have far more moving parts than our bodies and those parts move much faster. The same is true for computers: for every move that AlphaGo makes on the Go board, it performs millions or billions of units of computation, each of which involves adding a branch to the lookahead search tree and evaluating the board position at the end of that branch. And each of those units of computation happens because the program makes a choice about which part of the tree to explore next. Very approximately, AlphaGo chooses computations that it expects will improve its eventual decision on the board.
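The "choose which computation to do next" loop can be caricatured as a simple best-first search. This is a deliberately simplified sketch, not AlphaGo's actual algorithm (which is a far more sophisticated Monte Carlo tree search); the evaluate and expand functions are assumed to be supplied by the caller.

    # Caricature of metalevel control in lookahead search: each unit of
    # computation expands the frontier position currently judged most
    # promising. Illustrative only; not AlphaGo's real algorithm.
    import heapq, itertools

    def search(root, evaluate, expand, budget):
        counter = itertools.count()  # tie-breaker for the heap
        frontier = [(-evaluate(root), next(counter), root)]
        best, best_value = root, evaluate(root)
        for _ in range(budget):      # one iteration = one unit of computation
            if not frontier:
                break
            neg_value, _, node = heapq.heappop(frontier)
            if -neg_value > best_value:
                best, best_value = node, -neg_value
            for child in expand(node):  # add branches to the lookahead tree
                heapq.heappush(frontier, (-evaluate(child), next(counter), child))
        return best

Each pass through the loop is one "unit of computation" in the sense used above: pick the position whose exploration looks most valuable, grow the tree there, and update the current best choice.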

由于 AlphaGo 的计算活动简单而同质——每个计算单元都是同一类型——因此有可能为管理它的计算活动制定出一个合理的方案。与使用相同基本计算单元的其他程序相比,AlphaGo 可能相当高效,但与其他类型的程序相比,它可能极其低效。例如,在 2016 年那场划时代的比赛中,AlphaGo 的人类对手李世石每一步的计算单元可能不超过几千个,但他拥有灵活得多的计算架构,以及更多种类的计算单元:包括把棋盘划分为若干子棋局并设法处理它们之间的相互作用;识别可能实现的目标并制定诸如“让这块棋活下来”或“阻止对手连接这两块棋”之类的高级计划;思考如何实现某个特定目标,例如让一块棋活下来;以及因为某些着法无法应对重大威胁而将整类着法排除在外。

It has been possible to work out a reasonable scheme for managing AlphaGo’s computational activity because that activity is simple and homogeneous: every unit of computation is of the same kind. Compared to other programs that use that same basic unit of computation, AlphaGo is probably quite efficient, but it’s probably extremely inefficient compared to other kinds of programs. For example, Lee Sedol, AlphaGo’s human opponent in the epochal match of 2016, probably does no more than a few thousand units of computation per move, but he has a much more flexible computational architecture with many more kinds of units of computation: these include dividing the board into subgames and trying to resolve their interactions; recognizing possible goals to attain and making high-level plans with actions like “keep this group alive” or “prevent my opponent from connecting these two groups”; thinking about how to achieve a specific goal, such as keeping a group alive; and ruling out whole classes of moves because they fail to address a significant threat.

我们根本不知道如何组织如此复杂和多样的计算活动——如何整合并利用各项计算的结果,如何为不同类型的审议分配计算资源,以便尽可能快地找到好的决策。但有一点很清楚:像 AlphaGo 这样简单的计算架构不可能在现实世界中发挥作用,因为在现实世界中,我们通常需要处理的决策范围不是几十步,而是数十亿个原始步骤,而且任何时刻可能采取的动作数量几乎是无限的。重要的是要记住,现实世界中的智能代理并不局限于下围棋,甚至不局限于寻找 Stuart 的钥匙——它只是存在着。它接下来可以做任何事情,但它不可能有时间去思考它可能做的所有事情。

We simply don’t know how to organize such complex and varied computational activity—how to integrate and build on the results from each and how to allocate computational resources to the various kinds of deliberation so that good decisions are found as quickly as possible. It is clear, however, that a simple computational architecture like AlphaGo’s cannot possibly work in the real world, where we routinely need to deal with decision horizons of not tens but billions of primitive steps and where the number of possible actions at any point is almost infinite. It’s important to remember that an intelligent agent in the real world is not restricted to playing Go or even finding Stuart’s keys—it’s just being. It can do anything next, but it cannot possibly afford to think about all the things it might do.

如果一个系统既能发现新的高级操作(如前所述),又能管理其计算活动,专注于快速显著提高决策质量的计算单元,那么它将成为现实世界中强大的决策者。与人类一样,它的思考将是“认知高效的”,但它不会受到微小的短期记忆和缓慢的硬件的影响,而这些因素严重限制了我们展望未来、处理大量突发事件和考虑大量备选方案的能力。

A system that can both discover new high-level actions—as described earlier—and manage its computational activity to focus on units of computation that quickly deliver significant improvements in decision quality would be a formidable decision maker in the real world. Like those of humans, its deliberations would be “cognitively efficient,” but it would not suffer from the tiny short-term memory and slow hardware that severely limit our ability to look far into the future, handle a large number of contingencies, and consider a large number of alternative plans.

还缺少一些东西吗?

More things missing?

如果我们将我们所知道的所有方法与本章列出的所有潜在新发展结合起来,它会起作用吗?由此产生的系统将如何运作?它将随着时间的推移,吸收大量信息,并通过观察和推理大规模地跟踪世界的状态。它将逐渐改进它的世界模型(当然包括人类模型)。它将使用这些模型来解决复杂的问题,它将封装和重复使用其解决过程,使其审议更有效率,并能够解决更复杂的问题。它会发现新的概念和行动,这将使它能够提高发现速度。它将在越来越长的时间尺度上制定有效的计划。

If we put together everything we know how to do with all the potential new developments listed in this chapter, would it work? How would the resulting system behave? It would plow through time, absorbing vast quantities of information and keeping track of the state of the world on a massive scale by observation and inference. It would gradually improve its models of the world (which include models of humans, of course). It would use those models to solve complex problems and it would encapsulate and reuse its solution processes to make its deliberations more efficient and to enable the solution of still more complex problems. It would discover new concepts and actions, and these would allow it to improve its rate of discovery. It would make effective plans over increasingly long time scales.

总而言之,从能否有效实现目标这一系统角度来看,似乎并没有其他重大的东西缺失。当然,唯一能确定的方法,就是(在取得这些突破之后)把它建造出来,看看会发生什么。

In summary, it’s not obvious that anything else of great significance is missing, from the point of view of systems that are effective in achieving their objectives. Of course, the only way to be sure is to build it (once the breakthroughs have been achieved) and see what happens.

想象一个超级智能机器

Imagining a Superintelligent Machine

在讨论超级智能 AI 的性质和影响时,技术界一直缺乏想象力。我们经常看到的是关于减少医疗错误、48更安全的汽车49或其他渐进式进步的讨论。机器人被想象成随身携带自己大脑的个体实体,而事实上它们很可能通过无线方式连接成一个单一的全球性实体,利用巨大的固定计算资源。就好像研究人员害怕审视人工智能取得成功的真实后果似的。

The technical community has suffered from a failure of imagination when discussing the nature and impact of superintelligent AI. Often, we see discussions of reduced medical errors,48 safer cars,49 or other advances of an incremental nature. Robots are imagined as individual entities carrying their brains with them, whereas in fact they are likely to be wirelessly connected into a single, global entity that draws on vast stationary computing resources. It’s as if researchers are afraid of examining the real consequences of success in AI.

根据假设,通用智能系统可以做任何人类能做的事情。例如,一些人做了大量的数学、算法设计、编码和实证研究,才发明了现代搜索引擎。所有这些工作的成果非常有用,当然也很有价值。有多有价值?最近的一项研究表明,受访美国成年人的中位数要求至少获得 17,500 美元,才愿意放弃使用搜索引擎一年,50这相当于全球数十万亿美元的价值。

A general-purpose intelligent system can, by assumption, do what any human can do. For example, some humans did a lot of mathematics, algorithm design, coding, and empirical research to come up with the modern search engine. The results of all this work are very useful and of course very valuable. How valuable? A recent study showed that the median American adult surveyed would need to be paid at least $17,500 to give up using search engines for a year,50 which translates to a global value in the tens of trillions of dollars.

现在想象一下,搜索引擎还不存在,因为必要的几十年的工作还没有完成,但你可以使用超级智能的人工智能系统。只需提出问题,你现在就可以使用搜索引擎技术,这要归功于人工智能系统。完成了!价值数万亿美元的技术,只需提出问题,你无需编写一行额外的代码。任何其他缺失的发明或一系列发明也是如此:如果人类可以做到,机器也可以做到。

Now imagine that search engines don’t exist yet because the necessary decades of work have not been done, but you have access instead to a superintelligent AI system. Simply by asking the question, you now have access to search engine technology, courtesy of the AI system. Done! Trillions of dollars in value, just for the asking, and not a single line of additional code written by you. The same goes for any other missing invention or series of inventions: if humans could do it, so can the machine.

最后一点提供了一个有用的下限——一个悲观的估计——来衡量超级智能机器能做什么。根据假设,机器比单个人类更有能力。有很多事情是单个人类无法做到的,但一群 n 个人可以做到:把宇航员送上月球,建造引力波探测器,对人类基因组进行测序,管理一个拥有数亿人口的国家。因此,粗略地说,我们创建机器的 n 个软件副本,并以与 n 个人类相同的方式(使用相同的信息流和控制流)将它们连接起来。现在我们有了一台机器,它可以做 n 个人类能做的一切,而且做得更好,因为它的 n 个组件中的每一个都是超人的。

This last point provides a useful lower bound—a pessimistic estimate—on what a superintelligent machine can do. By assumption, the machine is more capable than an individual human. There are many things an individual human cannot do, but a collection of n humans can do: put an astronaut on the Moon, create a gravitational-wave detector, sequence the human genome, run a country with hundreds of millions of people. So, roughly speaking, we create n software copies of the machine and connect them in the same way—with the same information and control flows—as the n humans. Now we have a machine that can do whatever n humans can do, except better, because each of its n components is superhuman.

这种智能系统的多智能体协作设计,只是机器可能能力的下限,因为还有其他效果更好的设计。在 n 个人组成的集体中,全部可用信息分别保存在 n 个大脑中,并且在它们之间传递得非常缓慢且不完美。这就是为什么 n 个人把大部分时间都花在开会上。而在机器中,不需要这种分隔——正是这种分隔常常妨碍人们把各条线索串联起来。关于科学发现中线索未能串联的例子,简要浏览一下青霉素的漫长历史就足以让人大开眼界。51

This multi-agent cooperation design for an intelligent system is just a lower bound on the possible capabilities of machines because there are other designs that work better. In a collection of n humans, the total available information is kept separately in n brains and communicated very slowly and imperfectly between them. That’s why the n humans spend most of their time in meetings. In the machine, there is no need for this separation, which often prevents connecting the dots. For an example of disconnected dots in scientific discovery, a brief perusal of the long history of penicillin is quite eye-opening.51

另一种拓展想象力的有效方法是思考某种特定形式的感官输入——比如阅读——并将其扩大。人类可以在一周内阅读并理解一本书,而机器可以在几个小时内阅读并理解有史以来所有书籍——全部 1.5 亿本。这需要相当多的处理能力,但书籍可以基本并行阅读,这意味着只需添加更多芯片,机器就可以扩大其阅读过程。同样,机器可以通过卫星、机器人和数亿个监控摄像头同时看到一切;观看世界上所有的电视广播;收听世界上所有的广播电台和电话交谈。很快,它就会对世界及其居民有比任何人可能希望获得的更详细和更准确的了解。

Another useful method of stretching your imagination is to think about some particular form of sensory input—say, reading—and scale it up. Whereas a human can read and understand one book in a week, a machine could read and understand every book ever written—all 150 million of them—in a few hours. This requires a decent amount of processing power, but the books can be read largely in parallel, meaning that simply adding more chips allows the machine to scale up its reading process. By the same token, the machine can see everything at once through satellites, robots, and hundreds of millions of surveillance cameras; watch all the world’s TV broadcasts; and listen to all the world’s radio stations and phone conversations. Very quickly it would gain a far more detailed and accurate understanding of the world and its inhabitants than any human could possibly hope to acquire.
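A back-of-the-envelope check of that claim, under stated assumptions (the one-hour-per-book processing rate and the three-hour target are invented for illustration; only the 150 million figure comes from the text):

    # Rough arithmetic behind "every book in a few hours".
    import math

    books = 150_000_000   # every book ever written (figure from the text)
    hours_per_book = 1    # assumed speed of one reading process
    target_hours = 3      # "a few hours"

    processes = math.ceil(books * hours_per_book / target_hours)
    print(processes)      # 50,000,000 parallel reading processes

Nothing here is deep; the point is simply that reading parallelizes, so the bottleneck is hardware rather than time.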

我们还可以想象扩大机器的行动能力。人类只能直接控制一个身体,而机器可以控制数千甚至数百万个身体。一些自动化工厂已经展现出这一特点。在工厂外,一台控制数千个灵巧机器人的机器可以建造大量房屋,每栋房屋都根据未来居住者的需求和愿望量身定制。在实验室中,现有的用于科学研究的机器人系统可以扩大规模,同时进行数百万次实验——也许可以创建完整的人类生物学预测模型,直至分子水平。请注意,机器的推理能力将使其更有能力检测科学理论之间以及理论与观察之间的不一致之处。事实上,我们可能已经有足够的生物学实验证据来设计出治疗癌症的方法:我们只是还没有把它们整合在一起。

One can also imagine scaling the machine’s capacity for action. A human has direct control over only one body, while a machine can control thousands or millions. Some automated factories already exhibit this characteristic. Outside the factory, a machine that controls thousands of dexterous robots can, for example, produce vast numbers of houses, each one tailored to its future occupants’ needs and desires. In the lab, existing robotic systems for scientific research could be scaled up to perform millions of experiments simultaneously—perhaps to create complete predictive models of human biology down to the molecular level. Note that the machine’s reasoning capabilities will give it a far greater capacity to detect inconsistencies between scientific theories and between theories and observations. Indeed, it may already be the case that we have enough experimental evidence about biology to devise a cure for cancer: we just haven’t put it together.

在网络领域,机器已经可以访问数十亿个效应器,即世界上所有手机和电脑上的显示屏。这在一定程度上解释了 IT 公司能够用极少的员工创造巨额财富的能力;也表明人类极易受到通过屏幕实施的操纵。

In the cyber realm, machines already have access to billions of effectors—namely, the displays on all the phones and computers in the world. This partly explains the ability of IT companies to generate enormous wealth with very few employees; it also points to the severe vulnerability of the human race to manipulation via screens.

另一种形式的规模,来自机器能够比人类把未来看得更远、也预测得更准确的能力。我们已经在国际象棋和围棋中看到了这一点;凭借在长时间尺度上生成和分析分层计划的能力,以及识别新的抽象动作和高层描述模型的能力,机器将把这一优势转移到数学(证明新颖而有用的定理)和现实世界决策等领域。在发生环境灾难时疏散一座大城市之类的任务将相对简单,机器能够为每个人和每辆车生成个性化的引导,以最大限度地减少伤亡人数。

Scale of a different kind comes from the machine’s ability to look further into the future, with greater accuracy, than is possible for humans. We have seen this for chess and Go already; with the capacity for generating and analyzing hierarchical plans over long time scales and the ability to identify new abstract actions and high-level descriptive models, machines will transfer this advantage to domains such as mathematics (proving novel, useful theorems) and decision making in the real world. Tasks such as evacuating a large city in the event of an environmental disaster will be relatively straightforward, with the machine able to generate individual guidance for every person and vehicle to minimize the number of casualties.

在制定防止全球变暖的政策建议时,机器可能会费点力气。地球系统建模需要物理学(大气、海洋)、化学(碳循环、土壤)、生物学(分解、迁移)、工程学(可再生能源、碳捕获)、经济(工业、能源使用)、人性(愚蠢、贪婪)和政治(更加愚蠢、更加贪婪)。如上所述,机器将能够获得大量证据来支持所有这些模型。它将能够建议或开展新的实验和探险,以缩小不可避免的不确定性——例如,发现浅海水库中天然气水合物的真实范围。它将能够考虑各种可能的政策建议——法律、推动、市场、发明和地球工程干预——但当然,它还需要找到说服我们接受它们的方法。

The machine might work up a slight sweat when devising policy recommendations to prevent global warming. Earth systems modeling requires knowledge of physics (atmosphere, oceans), chemistry (carbon cycle, soils), biology (decomposition, migration), engineering (renewable energy, carbon capture), economics (industry, energy use), human nature (stupidity, greed), and politics (even more stupidity, even more greed). As noted, the machine will have access to vast quantities of evidence to feed all these models. It will be able to suggest or carry out new experiments and expeditions to narrow down the inevitable uncertainties—for example, to discover the true extent of gas hydrates in shallow ocean reservoirs. It will be able to consider a vast range of possible policy recommendations—laws, nudges, markets, inventions, and geoengineering interventions—but of course it will also need to find ways to persuade us to go along with them.

超级智能的局限性

The Limits of Superintelligence

在发挥想象力时,也不要想得太远。一个常见的错误是把神一般的全知能力归于超级智能 AI 系统——不仅对现在、而且对未来都有着完整而完美的了解。52这种想法相当难以置信,因为它既需要一种不符合物理规律的能力来确定世界的确切当前状态,也需要一种无法实现的能力,以远快于实时的速度模拟一个包括机器本身在内的世界的运行(更不用说数十亿个大脑了,它们届时仍将是宇宙中第二复杂的物体)。

While stretching your imagination, don’t stretch it too far. A common mistake is to attribute godlike powers of omniscience to superintelligent AI systems—complete and perfect knowledge not just of the present but also of the future.52 This is quite implausible because it requires an unphysical ability to determine the exact current state of the world as well as an unrealizable ability to simulate, much faster than real time, the operation of a world that includes the machine itself (not to mention billions of brains, which would still be the second-most-complex objects in the universe).

这并不是说不可能以合理的确定性预测未来的某些方面——例如,尽管混沌理论家会搬出蝴蝶翅膀之类的说法提出异议,我仍然知道将近一年之后我会在伯克利的哪间教室教哪门课。(我也不认为人类的预测水平已经接近物理定律所允许的极限!)预测取决于拥有正确的抽象——例如,我可以预测“我”将在 4 月的最后一个星期二出现在伯克利校园惠勒礼堂的讲台上,但我无法预测我精确到毫米的位置,也无法预测到那时哪些碳原子将被纳入我的身体。

This is not to say that it is impossible to predict some aspects of the future with a reasonable degree of certainty—for example, I know what class I’ll be teaching in what room at Berkeley almost a year from now, despite the protestations of chaos theorists about butterfly wings and all that. (Nor do I think that humans are anywhere close to predicting the future as well as the laws of physics allow!) Prediction depends on having the right abstractions—for example, I can predict that “I” will be “on stage in Wheeler Auditorium” on the Berkeley campus on the last Tuesday in April, but I cannot predict my exact location down to the millimeter or which atoms of carbon will have been incorporated into my body by then.

现实世界对机器获取新知识的速度也施加了一定的限制——这是凯文·凯利 (Kevin Kelly) 在关于超人人工智能的过于简单化的预测的文章中提出的有效观点之一。53例如,要确定某种特定药物能否治愈实验动物的某种癌症,科学家(无论是人类还是机器)有两种选择:给动物注射药物并等待数周,或运行足够精确的模拟。然而,运行模拟需要大量的生物学经验知识,其中一些知识目前尚不可用;因此,必须先进行更多的模型构建实验。毫无疑问,这些需要时间,而且必须在现实世界中完成。

Machines are also subject to certain speed limits imposed by the real world on the rate at which new knowledge of the world can be acquired—one of the valid points made by Kevin Kelly in his article on oversimplified predictions about superhuman AI.53 For example, to determine whether a specific drug cures a certain kind of cancer in an experimental animal, a scientist—human or machine—has two choices: inject the animal with the drug and wait several weeks or run a sufficiently accurate simulation. To run a simulation, however, requires a great deal of empirical knowledge of biology, some of which is currently unavailable; so, more model-building experiments would have to be done first. Undoubtedly, these would take time and must be done in the real world.

另一方面,机器科学家可以并行运行大量的模型构建实验,将其结果整合成一个内部一致(尽管非常复杂)的模型,并将模型的预测与生物学已知的全部实验证据进行比较。此外,模拟这个模型并不一定需要对整个生物体进行直至单个分子反应层面的量子力学模拟——正如凯利指出的那样,那将比直接在现实世界中做实验花费更多时间。就像我可以相当确定地预测自己四月里某个星期二的位置一样,生物系统的属性也可以用抽象模型准确预测。(这其中的一个原因在于,生物学依靠基于聚合反馈回路的稳健控制系统运作,因此初始条件的微小变化通常不会导致结果的巨大变化。)因此,虽然经验科学中不太可能出现即时的机器发现,但我们可以预期,在机器的帮助下,科学将发展得快得多。事实上,这已经在发生了。

On the other hand, a machine scientist could run vast numbers of model-building experiments in parallel, could integrate their outcomes into an internally consistent (albeit very complex) model, and could compare the model’s predictions with the entirety of experimental evidence known to biology. Moreover, simulating the model does not necessarily require a quantum-mechanical simulation of the entire organism down to the level of individual molecular reactions—which, as Kelly points out, would take more time than simply doing the experiment in the real world. Just as I can predict my future location on Tuesdays in April with some certainty, properties of biological systems can be predicted accurately with abstract models. (Among other reasons, this is because biology operates with robust control systems based on aggregate feedback loops, so that small variations in initial conditions usually don’t lead to large variations in outcomes.) Thus, while instantaneous machine discoveries in the empirical sciences are unlikely, we can expect that science will proceed much faster with the help of machines. Indeed, it already is.

机器的最后一个限制是它们不是人类。这使它们在试图建模和预测某一类特定对象时处于内在的劣势:这类对象就是人类。我们的大脑彼此都非常相似,所以我们可以用自己的大脑来模拟——如果你愿意,也可以说是体验——他人的精神和情感生活。这对我们来说是免费的。(仔细想想,机器彼此之间甚至有更大的优势:它们实际上可以运行彼此的代码!)例如,我不需要成为神经感觉系统专家,就能知道用锤子砸到拇指是什么感觉。我只需用锤子砸一下自己的拇指就行了。另一方面,机器几乎54必须从零开始建立对人类的理解:它们只能接触到我们的外部行为,外加全部神经科学和心理学文献,并且必须在此基础上弄清我们是如何运作的。原则上,它们能够做到这一点,但可以合理地推测,对人类获得人类水平或超人的理解,将比获得大多数其他能力花费更长的时间。

A final limitation of machines is that they are not human. This puts them at an intrinsic disadvantage when trying to model and predict one particular class of objects: humans. Our brains are all quite similar, so we can use them to simulate—to experience, if you will—the mental and emotional lives of others. This, for us, comes for free. (If you think about it, machines have an even greater advantage with each other: they can actually run each other’s code!) For example, I don’t need to be an expert on neural sensory systems to know what it feels like when you hit your thumb with a hammer. I can just hit my thumb with a hammer. Machines, on the other hand, have to start almost54 from scratch in their understanding of humans: they have access only to our external behavior, plus all the neuroscience and psychology literature, and have to develop an understanding of how we work on that basis. In principle, they will be able to do this, but it’s reasonable to suppose that acquiring a human-level or superhuman understanding of humans will take them longer than most other capabilities.

人工智能将如何造福人类?

How Will AI Benefit Humans?

我们的智能造就了我们的文明。有了更强大的智能,我们就能拥有更伟大、或许也远远更好的文明。人们可以推测如何解决一些重大的未解难题,比如无限期延长人类寿命,或开发超光速旅行,但这些科幻小说中的常见素材还不是人工智能进步的驱动力。(有了超级智能 AI,我们或许能发明各种类似魔法的技术,但现在很难说那些技术会是什么。)相反,不妨考虑一个平淡得多的目标:以可持续的方式,把地球上每个人的生活水平提高到在发达国家会被视为相当可观的程度。如果(多少有些武断地)把“可观”定义为美国的第 88 百分位水平,那么这一既定目标意味着全球国内生产总值(GDP)增长近十倍,从每年 76 万亿美元增加到 750 万亿美元。55

Our intelligence is responsible for our civilization. With access to greater intelligence we could have a greater—and perhaps far better—civilization. One can speculate about solving major open problems such as extending human life indefinitely or developing faster-than-light travel, but these staples of science fiction are not yet the driving force for progress in AI. (With superintelligent AI, we’ll probably be able to invent all sorts of quasi-magical technologies, but it’s hard to say now what those might be.) Consider, instead, a far more prosaic goal: raising the living standard of everyone on Earth, in a sustainable way, to a level that would be viewed as quite respectable in a developed country. Choosing (somewhat arbitrarily) respectable to mean the eighty-eighth percentile in the United States, the stated goal represents almost a tenfold increase in global gross domestic product (GDP), from $76 trillion to $750 trillion per year.55

为了计算此类奖金的现金价值,经济学家使用收入流的净现值,该值考虑了未来收入相对于现值的折现。每年 674 万亿美元的额外收入的净现值约为 13,500 万亿美元,56假设折现率为 5%。因此,从非常粗略的角度来看,如果人类水平的人工智能能够为每个人提供可观的生活水平,那么这个数字就是人类水平的人工智能可能值多少钱。有这样的数字,公司和国家每年在人工智能研发上投资数百亿美元也就不足为奇了。57即便如此,与奖金数额相比,投资金额微不足道。

To calculate the cash value of such a prize, economists use the net present value of the income stream, which takes into account the discounting of future income relative to the present. The extra income of $674 trillion per year has a net present value of roughly $13,500 trillion,56 assuming a discount factor of 5 percent. So, in very crude terms, this is a ballpark figure for what human-level AI might be worth if it can deliver a respectable living standard for everyone. With numbers like this, it’s not surprising that companies and countries are investing tens of billions of dollars annually in AI research and development.57 Even so, the sums invested are minuscule compared to the size of the prize.
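For readers who want to check the arithmetic: treating the extra income as a perpetuity discounted at 5 percent per year (a back-of-the-envelope sketch consistent with the figures quoted above, not necessarily the book's exact derivation) gives

    \mathrm{NPV} \;=\; \sum_{t=1}^{\infty}\frac{674}{(1.05)^{t}}
                 \;=\; \frac{674}{0.05}
                 \;=\; 13{,}480 \;\approx\; 13{,}500 \text{ trillion dollars.}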

当然,除非你对人类级别的人工智能如何实现提高生活水平这一壮举有所了解,否则这些都是虚构的数字。它只能通过增加人均商品和服务产量来实现这一目标。换句话说:普通人永远不可能期望消费超过普通人生产的商品和服务。本章前面讨论的自动驾驶出租车的例子说明了人工智能的乘数效应:有了自动化服务,(比如说)十个人应该能够管理一千辆车的车队,因此每个人生产的交通量是以前的一百倍。制造汽车和提取制造汽车的原材料也是如此。事实上,澳大利亚北部的一些铁矿石开采作业已经几乎完全自动化,那里的温度经常超过 45 摄氏度(113 华氏度)。58

Of course, these are all made-up numbers unless one has some idea of how human-level AI could achieve the feat of raising living standards. It can do this only by increasing the per-capita production of goods and services. Put another way: the average human can never expect to consume more than the average human produces. The example of self-driving taxis discussed earlier in the chapter illustrates the multiplier effect of AI: with an automated service, it should be possible for (say) ten people to manage a fleet of one thousand vehicles, so each person is producing one hundred times as much transportation as before. The same goes for manufacturing the cars and for extracting the raw materials from which the cars are made. Indeed, some iron-ore mining operations in northern Australia, where temperatures regularly exceed 45 degrees Celsius (113 degrees Fahrenheit), are almost completely automated already.58

当今人工智能的这些应用都是专用系统:自动驾驶汽车和自动运营的矿山需要在研究、机械设计、软件工程和测试方面投入巨资,才能开发出必要的算法并确保它们按预期工作。工程领域的所有事情都是这样做的。个人旅行过去也是如此:如果你想在 17 世纪从欧洲前往澳大利亚再返回,那将是一个耗资巨大的庞大项目,需要多年的规划,而且死亡风险很高。如今我们已经习惯了交通即服务(TaaS)的概念:如果你需要下周初到达墨尔本,只需在手机上点几下,并支付一笔相对微不足道的钱。

These present-day applications of AI are special-purpose systems: self-driving cars and self-operating mines have required huge investments in research, mechanical design, software engineering, and testing to develop the necessary algorithms and to make sure that they work as intended. That’s just how things are done in all spheres of engineering. That’s how things used to be done in personal travel too: if you wanted to travel from Europe to Australia and back in the seventeenth century, it would have involved a huge project costing vast sums of money, requiring years of planning, and carrying a high risk of death. Now we are used to the idea of transportation as a service (TaaS): if you need to be in Melbourne early next week, it just requires a few taps on your phone and a relatively minuscule amount of money.

通用人工智能将是一切皆服务(EaaS)。无需雇用大批不同学科的专家,组织成承包商和分包商的层级结构,即可完成项目。所有通用人工智能的化身都可以使用人类的所有知识和技能,甚至更多。唯一的区别在于物理能力:用于建筑或手术的灵巧腿式机器人、用于大规模货物运输的轮式机器人、用于空中检查的四轴飞行器机器人等等。原则上——政治和经济除外——每个人都可以拥有一个由软件代理和物理机器人组成的整个组织,能够设计和建造桥梁、提高农作物产量、为 100 名客人做饭、进行选举或做任何其他需要做的事情。正是通用智能的通用性使这成为可能。

General-purpose AI would be everything as a service (EaaS). There would be no need to employ armies of specialists in different disciplines, organized into hierarchies of contractors and subcontractors, in order to carry out a project. All embodiments of general-purpose AI would have access to all the knowledge and skills of the human race, and more besides. The only differentiation would be in the physical capabilities: dexterous legged robots for construction or surgery, wheeled robots for large-scale goods transportation, quadcopter robots for aerial inspections, and so on. In principle—politics and economics aside—everyone could have at their disposal an entire organization composed of software agents and physical robots, capable of designing and building bridges, improving crop yields, cooking dinner for a hundred guests, running elections, or doing whatever else needs doing. It’s the generality of general-purpose intelligence that makes this possible.

当然,历史已经表明,即使没有人工智能,全球人均 GDP 也能增长十倍——只是实现这一增长花了 190 年时间(从 1820 年到 2010 年)。59需要工厂、机床、自动化、铁路、钢铁、汽车、飞机、电力、石油和天然气生产、电话、广播、电视、计算机、互联网、卫星和许多其他革命性发明的发展。前几段中提出的 GDP 十倍增长并非取决于进一步的革命性技术,而取决于人工智能系统能够更有效地、更大规模地利用我们已有的技术。

History has shown, of course, that a tenfold increase in global GDP per capita is possible without AI—it’s just that it took 190 years (from 1820 to 2010) to achieve that increase.59 It required the development of factories, machine tools, automation, railways, steel, cars, airplanes, electricity, oil and gas production, telephones, radio, television, computers, the Internet, satellites, and many other revolutionary inventions. The tenfold increase in GDP posited in the preceding paragraphs is predicated not on further revolutionary technologies but on the ability of AI systems to employ what we already have more effectively and at greater scale.

当然,除了提高生活水平这一纯粹的物质利益之外,还会有其他影响。例如,众所周知,个人辅导比课堂教学有效得多,但如果由人类来做,对于绝大多数人来说根本负担不起——而且永远如此。有了人工智能辅导,每个孩子,无论多么贫穷,都能发挥出自己的潜力。每个孩子的成本微不足道,而这个孩子将过上远为丰富、远为有成效的生活。无论是个人还是集体,对艺术和智力的追求都将成为生活的正常组成部分,而不是一种稀有的奢侈品。

Of course, there will be effects besides the purely material benefit of raising living standards. For example, personal tutoring is known to be far more effective than classroom teaching, but when done by humans it is simply unaffordable—and always will be—for the vast majority of people. With AI tutors, the potential of each child, no matter how poor, can be realized. The cost per child would be negligible, and that child would live a far richer and more productive life. The pursuit of artistic and intellectual endeavors, whether individually or collectively, would be a normal part of life rather than a rarefied luxury.

在健康领域,人工智能系统应能帮助研究人员解开和掌握人类生物学的复杂性,从而逐渐消除疾病。对人类心理学和神经化学的深入了解应能广泛改善心理健康。

In the area of health, AI systems should enable researchers to unravel and master the vast complexities of human biology and thereby gradually banish disease. Greater insights into human psychology and neurochemistry should lead to broad improvements in mental health.

或许更为不同寻常的是,人工智能可以为虚拟现实 (VR) 提供更为有效的创作工具,并可以在 VR 环境中填充更为有趣的实体。这可能会使 VR 成为文学和艺术表达的首选媒介,创造出目前难以想象的丰富而深刻的体验。

Perhaps more unconventionally, AI could enable far more effective authoring tools for virtual reality (VR) and could populate VR environments with far more interesting entities. This might turn VR into the medium of choice for literary and artistic expression, creating experiences of a richness and depth that is currently unimaginable.

在日常生活中,如果设计合理,且不被经济和政治利益所左右,一个智能助手和向导将使每个人能够在日益复杂、有时充满敌意的经济和政治体系中有效地为自己采取行动。实际上,你将随时拥有一名强大的律师、会计师和政治顾问。正如人们希望通过混合使用哪怕是一小部分自动驾驶汽车来缓解交通拥堵一样,我们只能希望,更明智、更明智的全球公民将制定出更明智的政策,减少冲突。

And in the mundane world of daily life, an intelligent assistant and guide would—if well designed and not co-opted by economic and political interests—empower every individual to act effectively on their own behalf in an increasingly complex and sometimes hostile economic and political system. You would, in effect, have a high-powered lawyer, accountant, and political adviser on call at any time. Just as traffic jams are expected to be smoothed out by intermixing even a small percentage of autonomous vehicles, one can only hope that wiser policies and fewer conflicts will emerge from a better-informed and better-advised global citizenry.

这些发展结合在一起可能会改变历史的动态——至少是历史中由社会内部和社会之间为获得生活必需品而发生的冲突所驱动的那部分。如果蛋糕本质上是无限的,那么与他人争夺更大的份额就毫无意义了。这就像争夺谁能获得最多的报纸数字版一样——当任何人都可以免费制作任意数量的数字版时,这完全毫无意义。

These developments taken together could change the dynamic of history—at least that part of history that has been driven by conflicts within and between societies for access to the wherewithal of life. If the pie is essentially infinite, then fighting others for a larger share makes little sense. It would be like fighting over who gets the most digital copies of the newspaper—completely pointless when anyone can make as many digital copies as they want for free.

人工智能所能提供的东西也有一些限制。土地和原材料不是无限的,因此人口增长不可能无限,也不是每个人都能在私人公园里拥有豪宅。(这最终将需要在太阳系其他地方进行采矿并在太空建造人工栖息地;但我保证不谈论科幻小说。)骄傲的蛋糕也是有限的:只有 1% 的人可以在任何给定指标上进入前 1%。如果人类幸福需要进入前 1%,那么 99% 的人都会不幸福,即使底层 1% 的人过着客观上辉煌的生活。60因此,对于我们的文化来说,逐渐降低骄傲和嫉妒作为感知自我价值的核心要素的权重将非常重要。

There are some limits to what AI can provide. The pies of land and raw materials are not infinite, so there cannot be unlimited population growth and not everyone will have a mansion in a private park. (This will eventually necessitate mining elsewhere in the solar system and constructing artificial habitats in space; but I promised not to talk about science fiction.) The pie of pride is also finite: only 1 percent of people can be in the top 1 percent on any given metric. If human happiness requires being in the top 1 percent, then 99 percent of humans are going to be unhappy, even when the bottom 1 percent has an objectively splendid lifestyle.60 It will be important, then, for our cultures to gradually down-weight pride and envy as central elements of perceived self-worth.

正如尼克·博斯特罗姆在其著作《超级智能》的结尾所说,人工智能的成功将产生“一种文明轨迹,使人类能够以富有同情心和欢乐的方式利用宇宙的天赋。”如果我们无法利用人工智能所提供的优势,那只能怪我们自己。

As Nick Bostrom puts it at the end of his book Superintelligence, success in AI will yield “a civilizational trajectory that leads to a compassionate and jubilant use of humanity’s cosmic endowment.” If we fail to take advantage of what AI has to offer, we will have only ourselves to blame.

4

4

人工智能的滥用

MISUSES OF AI

以富有同情心和欢乐的方式利用人类的宇宙天赋听起来很棒,但我们也必须考虑到不法行为领域的快速创新。心怀恶意的人正在迅速想出滥用人工智能的新方法,以至于本章可能在印刷出来之前就过时了。然而,不要把它看作是令人沮丧的读物,而要把它看作是在为时已晚之前采取行动的呼吁。

A compassionate and jubilant use of humanity’s cosmic endowment sounds wonderful, but we also have to reckon with the rapid rate of innovation in the malfeasance sector. Ill-intentioned people are thinking up new ways to misuse AI so quickly that this chapter is likely to be outdated even before it attains printed form. Think of it not as depressing reading, however, but as a call to act before it is too late.

监视、劝说和控制

Surveillance, Persuasion, and Control

自动化的史塔西

The automated Stasi

东德国家安全部(通常称为史塔西)被广泛认为是“有史以来最有效、最专制的情报和秘密警察机构之一”。1它保存着绝大多数东德家庭的档案。它监听电话、拆阅信件,并在公寓和酒店安装隐藏摄像头。它在识别和消灭异见活动方面冷酷而高效。它首选的作案手法是心理上的摧毁,而不是监禁或处决。然而,这种程度的控制代价高昂:据估计,超过四分之一的劳动年龄成年人是史塔西的线人。史塔西的纸质记录估计有 200 亿页,2处理这些海量涌入的信息并据此采取行动的任务,已开始超出任何人类组织的能力。

The Ministerium für Staatssicherheit of East Germany, more commonly known as the Stasi, is widely regarded as “one of the most effective and repressive intelligence and secret police agencies to have ever existed.”1 It maintained files on the great majority of East German households. It monitored phone calls, read letters, and planted hidden cameras in apartments and hotels. It was ruthlessly effective at identifying and eliminating dissident activity. Its preferred modus operandi was psychological destruction rather than imprisonment or execution. This level of control came at great cost, however: by some estimates, more than a quarter of working-age adults were Stasi informants. Stasi paper records have been estimated at twenty billion pages2 and the task of processing and acting on the huge incoming flows of information began to exceed the capacity of any human organization.

因此,情报机构已经发现在其工作中使用人工智能的潜力,这不足为奇。多年来,他们一直在应用简单形式的人工智能技术,包括语音识别,以及对语音和文本中关键词和短语的识别。人工智能系统越来越能够理解人们所说和所做内容,无论是在语音、文本还是视频监控中。在采用这种技术实施控制的政权中,就好像每个公民都有一名专属的史塔西特工全天 24 小时监视着自己。3

It should come as no surprise, then, that intelligence agencies have spotted the potential for using AI in their work. For many years, they have been applying simple forms of AI technology, including voice recognition and identification of key words and phrases in both speech and text. Increasingly, AI systems are able to understand the content of what people are saying and doing, whether in speech, text, or video surveillance. In regimes where this technology is adopted for the purposes of control, it will be as if every citizen had their own personal Stasi operative watching over them twenty-four hours a day.3

即使在相对自由国家的民用领域,我们也受到越来越有效的监控。公司收集并出售关于我们的购物、互联网和社交网络使用、家电使用、通话和短信记录、就业和健康状况的信息。我们的位置可以通过手机和联网汽车被跟踪。摄像头在街上识别我们的面孔。所有这些数据以及更多数据,都可以由智能信息整合系统拼凑起来,形成一幅相当完整的图景:我们每个人在做什么、如何生活、喜欢谁、不喜欢谁,以及会把票投给谁。4相比之下,史塔西看起来就像业余选手。

Even in the civilian sphere, in relatively free countries, we are subject to increasingly effective surveillance. Corporations collect and sell information about our purchases, Internet and social network usage, electrical appliance usage, calling and texting records, employment, and health. Our locations can be tracked through our cell phones and our Internet-connected cars. Cameras recognize our faces on the street. All this data, and much more, can be pieced together by intelligent information integration systems to produce a fairly complete picture of what each of us is doing, how we live our lives, who we like and dislike, and how we will vote.4 The Stasi will look like amateurs by comparison.

控制你的行为

Controlling your behavior

一旦监控能力到位,下一步就是按照技术部署者的意愿修改你的行为。一种相当粗暴的方法是自动化的、个性化的勒索:一个了解你在做什么的系统——无论是通过监听、阅读还是观察你——可以很容易地发现你不该做的事情。一旦它发现了什么,它就会与你通信,以榨取尽可能多的钱财(如果目标是政治控制或间谍活动,则是胁迫你的行为)。榨取到的金钱是强化学习算法完美的奖励信号,因此我们可以预期,人工智能系统在识别不当行为并从中获利的能力上会迅速提高。2015 年初,我向一位计算机安全专家提出,由强化学习驱动的自动勒索系统可能很快就会成为现实;他笑着说这已经在发生了。第一个被广泛报道的勒索机器人是 Delilah,于 2016 年 7 月被发现。5

Once surveillance capabilities are in place, the next step is to modify your behavior to suit those who are deploying this technology. One rather crude method is automated, personalized blackmail: a system that understands what you are doing—whether by listening, reading, or watching you—can easily spot things you should not be doing. Once it finds something, it will enter into correspondence with you to extract the largest possible amount of money (or to coerce behavior, if the goal is political control or espionage). The extraction of money works as the perfect reward signal for a reinforcement learning algorithm, so we can expect AI systems to improve rapidly in their ability to identify and profit from misbehavior. Early in 2015, I suggested to a computer security expert that automated blackmail systems, driven by reinforcement learning, might soon become feasible; he laughed and said it was already happening. The first blackmail bot to be widely publicized was Delilah, identified in July 2016.5

改变人们行为的更微妙的方法是改变他们的信息环境,使他们相信不同的事情并做出不同的决定。当然,广告商几个世纪以来一直在这样做,以此来改变个人的购买行为。宣传作为战争和政治统治的工具有着更长的历史。

A more subtle way to change people’s behavior is to modify their information environment so that they believe different things and make different decisions. Of course, advertisers have been doing this for centuries as a way of modifying the purchasing behavior of individuals. Propaganda as a tool of war and political domination has an even longer history.

那么现在有什么不同呢?首先,由于人工智能系统可以跟踪个人的在线阅读习惯、偏好和可能的知识状态,它们可以为特定个人量身定制信息,以最大限度地影响此人,同时将信息不被相信的风险降至最低。其次,人工智能系统知道个人是否阅读了信息、花了多长时间阅读,以及是否点击了信息中的其他链接。然后,它把这些信号用作对其影响每个人的尝试成功与否的即时反馈;通过这种方式,它很快就学会把自己的工作做得更有效。这就是社交媒体上的内容选择算法对政治观点产生潜移默化的有害影响的方式。

So what’s different now? First, because AI systems can track an individual’s online reading habits, preferences, and likely state of knowledge, they can tailor specific messages to maximize impact on that individual while minimizing the risk that the information will be disbelieved. Second, the AI system knows whether the individual reads the message, how long they spend reading it, and whether they follow additional links within the message. It then uses these signals as immediate feedback on the success or failure of its attempt to influence each individual; in this way, it quickly learns to become more effective in its work. This is how content selection algorithms on social media have had their insidious effect on political opinions.
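In essence, the feedback loop just described is a multi-armed bandit. Here is a minimal sketch under illustrative assumptions (the engagement function standing in for click and reading-time signals is hypothetical, and no real platform's system is this simple):

    # Minimal epsilon-greedy bandit sketch of the feedback loop above:
    # engagement signals act as reward for message selection.
    import random

    def select_and_learn(messages, engagement, trials, epsilon=0.1):
        # messages: candidate items; engagement(m) returns an observed
        # reward signal, e.g. 1 if the message was read or clicked.
        counts = {m: 0 for m in messages}
        values = {m: 0.0 for m in messages}
        for _ in range(trials):
            if random.random() < epsilon:        # occasionally explore
                m = random.choice(messages)
            else:                                # otherwise exploit best so far
                m = max(messages, key=lambda x: values[x])
            r = engagement(m)                    # immediate feedback signal
            counts[m] += 1
            values[m] += (r - values[m]) / counts[m]  # running average
        return max(messages, key=lambda x: values[x])

The design point is the one made in the text: because feedback arrives immediately and per individual, even this crude learner steadily improves its influence over what each person sees.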

另一个最近的变化是,人工智能、计算机图形学和语音合成的结合,使生成深度伪造内容成为可能——几乎任何人说任何话、做任何事的逼真视频和音频。这项技术只需要对所需事件的一段口头描述,因此世界上几乎任何人都能使用它。参议员 X 在可疑场所 Z 收受可卡因贩子 Y 贿赂的手机视频?没问题!这种内容可以诱使人们对从未发生过的事情产生不可动摇的信念。6此外,人工智能系统可以生成数百万个虚假身份(即所谓的机器人大军),它们每天可以发出数十亿条评论、推文和推荐,淹没血肉之躯的人类交流真实信息的努力。依靠信誉系统7建立买卖双方信任的 eBay、淘宝和亚马逊等在线市场,正在与旨在破坏市场的机器人大军持续作战。

Another recent change is that the combination of AI, computer graphics, and speech synthesis is making it possible to generate deepfakes—realistic video and audio content of just about anyone saying or doing just about anything. The technology will require little more than a verbal description of the desired event, making it usable by more or less anyone in the world. Cell phone video of Senator X accepting a bribe from cocaine dealer Y at shady establishment Z? No problem! This kind of content can induce unshakeable beliefs in things that never happened.6 In addition, AI systems can generate millions of false identities—the so-called bot armies—that can pump out billions of comments, tweets, and recommendations daily, swamping the efforts of mere humans to exchange truthful information. Online marketplaces such as eBay, Taobao, and Amazon that rely on reputation systems7 to build trust between buyers and sellers are constantly at war with bot armies designed to corrupt the markets.

最后,如果政府能够根据行为实施奖惩,控制方法就可以变得直接。这样的系统将人们视为强化学习算法,训练他们以优化国家设定的目标。政府,特别是具有自上而下的工程思维的政府,很容易做出以下推理:如果每个人都表现良好、有爱国态度、为国家进步做出贡献,那就更好了;技术可以衡量个人的行为、态度和贡献;因此,如果我们建立一个基于技术的奖惩监控系统,每个人都会过得更好。

Finally, methods of control can be direct if a government is able to implement rewards and punishments based on behavior. Such a system treats people as reinforcement learning algorithms, training them to optimize the objective set by the state. The temptation for a government, particularly one with a top-down, engineering mind-set, is to reason as follows: it would be better if everyone behaved well, had a patriotic attitude, and contributed to the progress of the country; technology enables measurement of individual behavior, attitudes, and contributions; therefore, everyone will be better off if we set up a technology-based system of monitoring and control based on rewards and punishments.

这种思路有几个问题。首先,它忽略了在侵入性监控和胁迫制度下生活的心理成本;用外表的和谐掩盖内心的痛苦绝不是一种理想状态。每一次善举都不再是善举,而是成为个人得分最大化的行为,并被接受者视为如此。或者更糟的是,自愿行善的概念本身逐渐成为人们过去所做的事情的逐渐消逝的记忆。在这样的制度下,探望住院的生病朋友,其道德意义和情感价值并不比在红灯前停车更有价值。其次,该计划与人工智能的标准模型一样,成为同样的失败模式的牺牲品,因为它假设既定目标实际上是真正的潜在目标。不可避免地,古德哈特定律将占据主导地位,即个人优化对外在行为的官方衡量标准,就像大学已经学会优化大学排名系统使用的“质量”的“客观”衡量标准,而不是提高其真实(但未衡量)的质量一样。8最后,强加统一的行为美德衡量标准忽略了一个事实,即一个成功的社会可能由各种各样的个人组成,每个人都以自己的方式做出贡献。

There are several problems with this line of thinking. First, it ignores the psychic cost of living under a system of intrusive monitoring and coercion; outward harmony masking inner misery is hardly an ideal state. Every act of kindness ceases to be an act of kindness and becomes instead an act of personal score maximization and is perceived as such by the recipient. Or worse, the very concept of a voluntary act of kindness gradually becomes just a fading memory of something people used to do. Visiting an ailing friend in hospital will, under such a system, have no more moral significance and emotional value than stopping at a red light. Second, the scheme falls victim to the same failure mode as the standard model of AI, in that it assumes that the stated objective is in fact the true, underlying objective. Inevitably, Goodhart’s law will take over, whereby individuals optimize the official measure of outward behavior, just as universities have learned to optimize the “objective” measures of “quality” used by university ranking systems instead of improving their real (but unmeasured) quality.8 Finally, the imposition of a uniform measure of behavioral virtue misses the point that a successful society may comprise a wide variety of individuals, each contributing in their own way.

精神安全权

A right to mental security

人类文明的伟大成就之一就是人身安全的逐步改善。我们大多数人可以期待在日常生活中不再担心受伤和死亡。1948 年《世界人权宣言》第 3 条规定:“人人有权享有生命、自由和人身安全。”

One of the great achievements of civilization has been the gradual improvement in physical security for humans. Most of us can expect to conduct our daily lives without constant fear of injury and death. Article 3 of the 1948 Universal Declaration of Human Rights states, “Everyone has the right to life, liberty and security of person.”

我想提议,每个人还应当享有精神安全权——生活在一个基本真实的信息环境中的权利。人类倾向于相信自己眼睛和耳朵获得的证据。我们相信家人、朋友、老师和(某些)媒体会把他们认为是真相的东西告诉我们。尽管我们并不指望二手车销售员和政客告诉我们真相,但我们仍难以相信他们有时会撒谎撒得如此厚颜无耻。因此,我们极易受到虚假信息技术的攻击。

I would like to suggest that everyone should also have the right to mental security—the right to live in a largely true information environment. Humans tend to believe the evidence of our eyes and ears. We trust our family, friends, teachers, and (some) media sources to tell us what they believe to be the truth. Even though we do not expect used-car salespersons and politicians to tell us the truth, we have trouble believing that they are lying as brazenly as they sometimes do. We are, therefore, extremely vulnerable to the technology of misinformation.

精神安全权似乎并未被写入《世界人权宣言》。第 18 条和第 19 条确立了“思想自由”和“发表意见的自由”的权利。当然,一个人的思想和意见部分是由其信息环境塑造的,而信息环境又受第 19 条“通过任何媒介和不论国界传递信息和思想的权利”的支配。也就是说,世界上任何地方的任何人,都有权向你传递虚假信息。而困难正在于此:民主国家,尤其是美国,出于对政府控制言论的合理担忧,在大多数情况下不愿意(或在宪法上无法)阻止在公众关切的问题上传播虚假信息。民主国家似乎没有追求“没有真实信息就没有思想自由”的理念,而是天真地相信真相最终会胜出,这种信任让我们失去了保护。德国是个例外;它最近通过了《网络执行法》,要求内容平台删除被禁止的仇恨言论和假新闻,但这项法律因被认为不可行和不民主而受到相当多的批评。9

The right to mental security does not appear to be enshrined in the Universal Declaration. Articles 18 and 19 establish the rights of “freedom of thought” and “freedom of opinion and expression.” One’s thoughts and opinions are, of course, partly formed by one’s information environment, which, in turn, is subject to Article 19’s “right to . . . impart information and ideas through any media and regardless of frontiers.” That is, anyone, anywhere in the world, has the right to impart false information to you. And therein lies the difficulty: democratic nations, particularly the United States, have for the most part been reluctant—or constitutionally unable—to prevent the imparting of false information on matters of public concern because of justifiable fears regarding government control of speech. Rather than pursuing the idea that there is no freedom of thought without access to true information, democracies seem to have placed a naïve trust in the idea that the truth will win out in the end, and this trust has left us unprotected. Germany is an exception; it recently passed the Network Enforcement Act, which requires content platforms to remove proscribed hate speech and fake news, but this has come under considerable criticism as being unworkable and undemocratic.9

因此,就目前而言,我们可以预计我们的精神安全将继续受到攻击,主要靠商业和志愿者的努力来保护。这些努力包括 factcheck.org 和 snopes.com 等事实核查网站——但当然,其他的“事实核查”网站也在涌现,把真相宣布为谎言,把谎言宣布为真相。

For the time being, then, we can expect our mental security to remain under attack, protected mainly by commercial and volunteer efforts. These efforts include fact-checking sites such as factcheck.org and snopes.com—but of course other “fact-checking” sites are springing up to declare truth as lies and lies as truth.

谷歌和 Facebook 等主要信息公用事业公司,在欧洲和美国面临着“拿出办法”的巨大压力。它们正在尝试各种方式——同时使用人工智能和人工审核员——来标记或降级虚假内容,并把用户引向能够抵消虚假信息影响的经过核实的来源。归根结底,所有这些努力都依赖于循环式的声誉系统:消息来源之所以被信任,是因为受信任的来源报告它们值得信任。如果传播的虚假信息足够多,这些声誉系统就会失灵:真正值得信赖的来源可能变得不被信任,反之亦然,正如今天美国 CNN 和福克斯新闻等主要媒体来源身上正在发生的情况。致力于打击虚假信息的技术专家 Aviv Ovadya 将此称为“信息末日——思想市场的灾难性失败”。10

The major information utilities such as Google and Facebook have come under extreme pressure in Europe and the United States to “do something about it.” They are experimenting with ways to flag or relegate false content—using both AI and human screeners—and to direct users to verified sources that counteract the effects of misinformation. Ultimately, all such efforts rely on circular reputation systems, in the sense that sources are trusted because trusted sources report them to be trustworthy. If enough false information is propagated, these reputation systems can fail: sources that are actually trustworthy can become untrusted and vice versa, as appears to be occurring today with major media sources such as CNN and Fox News in the United States. Aviv Ovadya, a technologist working against misinformation, has called this the “infopocalypse—a catastrophic failure of the marketplace of ideas.”10

保护声誉系统运作的一种方法,是注入尽可能接近基准真相的来源。一个确定为真的事实,可以推翻任意数量只是“还算可信”的消息来源——只要这些来源传播了与该已知事实相悖的信息。在许多国家,公证人充当基准真相的来源,以维护法律和房地产信息的完整性;他们通常是任何交易中利益无涉的第三方,并由政府或专业协会授予执照。(在伦敦金融城,文书同业公会自 1373 年以来一直承担这一职责,这表明讲真话这一角色具有某种稳定性。)如果事实核查员有了正式的标准、专业资格和执照程序,将有助于保持我们所依赖的信息流的有效性。W3C 可信网络小组和可信度联盟等组织旨在开发评估信息提供者的技术方法和众包方法,从而让用户可以过滤掉不可靠的来源。

One way to protect the functioning of reputation systems is to inject sources that are as close as possible to ground truth. A single fact that is certainly true can invalidate any number of sources that are only somewhat trustworthy, if those sources disseminate information contrary to the known fact. In many countries, notaries function as sources of ground truth to maintain the integrity of legal and real-estate information; they are usually disinterested third parties in any transaction and are licensed by governments or professional societies. (In the City of London, the Worshipful Company of Scriveners has been doing this since 1373, suggesting that a certain stability inheres in the role of truth telling.) If formal standards, professional qualifications, and licensing procedures emerge for fact-checkers, that would tend to preserve the validity of the information flows on which we depend. Organizations such as the W3C Credible Web group and the Credibility Coalition aim to develop technological and crowdsourcing methods for evaluating information providers, which would then allow users to filter out unreliable sources.

保护声誉系统的第二种方法是让传播虚假信息付出代价。因此,一些酒店评级网站只接受通过该网站预订并支付了某家酒店房间费用的旅客对该酒店的评论,而其他评级网站则接受任何人的评论。前者的评分偏差要小得多,这并不奇怪,因为它们让虚假评论付出了代价(支付一间不必要的酒店房间的费用)。11监管处罚则更具争议性:没有人想要一个真理部,而德国的《网络执行法》只惩罚内容平台,而不惩罚发布假新闻的人。另一方面,正如许多国家和美国许多州将未经许可录制电话视为非法一样,至少应该可以对制作真实人物的虚构音频和视频的行为施加处罚。

A second way to protect reputation systems is to impose a cost for purveying false information. Thus, some hotel rating sites accept reviews concerning a particular hotel only from those who have booked and paid for a room at that hotel through the site, while other rating sites accept reviews from anyone. It will come as no surprise that ratings at the former sites are far less biased, because they impose a cost (paying for an unnecessary hotel room) for fraudulent reviews.11 Regulatory penalties are more controversial: no one wants a Ministry of Truth, and Germany’s Network Enforcement Act penalizes only the content platform, not the person posting the fake news. On the other hand, just as many nations and many US states make it illegal to record telephone calls without permission, it ought, at least, to be possible to impose penalties for creating fictitious audio and video recordings of real people.

最后,还有两个事实对我们有利。首先,几乎没有人愿意在知情的情况下主动被欺骗。(这并不是说父母总会认真查证那些称赞其子女聪明迷人的人是否诚实;只是说,他们不太可能向一个众所周知一有机会就撒谎的人寻求这种认可。)这意味着,无论持何种政治观点的人,都有动力采用能帮助他们区分真相和谎言的工具。其次,没有人愿意被当成撒谎者,新闻媒体尤其如此。这意味着信息提供者——至少是那些在乎声誉的信息提供者——有动力加入行业协会,遵守有利于讲真话的行为准则。反过来,社交媒体平台可以为用户提供选项,让他们只看到来自遵守这些准则并接受第三方事实核查的可信来源的内容。

Finally, there are two other facts that work in our favor. First, almost no one actively wants, knowingly, to be lied to. (This is not to say that parents always inquire vigorously into the truthfulness of those who praise their children’s intelligence and charm; it’s just that they are less likely to seek such approval from someone who is known to lie at every opportunity.) This means that people of all political persuasions have an incentive to adopt tools that help them distinguish truth from lies. Second, no one wants to be known as a liar, least of all news outlets. This means that information providers—at least those for whom reputation matters—have an incentive to join industry associations and subscribe to codes of conduct that favor truth telling. In turn, social media platforms can offer users the option of seeing content from only reputable sources that subscribe to these codes and subject themselves to third-party fact-checking.

致命自主武器

Lethal Autonomous Weapons

联合国将致命自主武器系统(简称 AWS,因为 LAWS 非常令人困惑)定义为“无需人工干预即可定位、选择和消灭人类目标”的武器系统。AWS 被描述为继火药和核武器之后的“第三次战争革命”,这是有充分理由的。

The United Nations defines lethal autonomous weapons systems (AWS for short, because LAWS is quite confusing) as weapons systems that “locate, select, and eliminate human targets without human intervention.” AWS have been described, with good reason, as the “third revolution in warfare,” after gunpowder and nuclear weapons.

您可能在媒体上读过有关 AWS 的文章;通常文章会称其为杀手机器人,并配上《终结者》电影中的图像。这至少在两个方面具有误导性:首先,它暗示自主武器是一种威胁,因为它们可能会接管世界并摧毁人类;其次,它暗示自主武器将是人形的、有意识的和邪恶的。

You may have read articles in the media about AWS; usually the article will call them killer robots and will be festooned with images from the Terminator movies. This is misleading in at least two ways: first, it suggests that autonomous weapons are a threat because they might take over the world and destroy the human race; second, it suggests that autonomous weapons will be humanoid, conscious, and evil.

媒体对这一问题的报道,最终让它看起来像科幻小说。甚至德国政府也被带偏了:它最近发表了一份声明,12声称“拥有学习和发展自我意识的能力,是把单项功能或武器系统定义为自主的不可或缺的属性”。(这就像断言一枚导弹除非超过光速否则就不算导弹一样有道理。)事实上,自主武器的自主程度将与国际象棋程序相同:程序被赋予赢棋的任务,但由它自己决定把棋子走到哪里、吃掉对方的哪些棋子。

The net effect of the media’s portrayal of the issue has been to make it seem like science fiction. Even the German government has been taken in: it recently issued a statement12 asserting that “having the ability to learn and develop self-awareness constitutes an indispensable attribute to be used to define individual functions or weapon systems as autonomous.” (This makes as much sense as asserting that a missile isn’t a missile unless it goes faster than the speed of light.) In fact, autonomous weapons will have the same degree of autonomy as a chess program, which is given the mission of winning the game but decides by itself where to move its pieces and which enemy pieces to eliminate.

图 7:(左)以色列航空工业公司生产的 Harop 巡飞武器;(右)Slaughterbots 视频中的静止图像,展示了一种内含小型爆炸驱动射弹的自主武器的可能设计。

FIGURE 7: (left) Harop loitering weapon produced by Israel Aerospace Industries; (right) still image from the Slaughterbots video showing a possible design for an autonomous weapon containing a small, explosive-driven projectile.

AWS 并非科幻小说。它们已经存在。最清晰的例子可能是以色列的 Harop(图 7,左),这是一种翼展 10 英尺、携带 50 磅弹头的巡飞弹。它可以在给定的地理区域内搜索长达 6 个小时,寻找任何符合给定标准的目标,然后将其摧毁。标准可以是“发射类似防空雷达的雷达信号”或“看起来像坦克”。

AWS are not science fiction. They already exist. Probably the clearest example is Israel’s Harop (figure 7, left), a loitering munition with a ten-foot wingspan and a fifty-pound warhead. It searches for up to six hours in a given geographical region for any target that meets a given criterion and then destroys it. The criterion could be “emits a radar signal resembling antiaircraft radar” or “looks like a tank.”

通过结合微型四旋翼设计、微型相机、计算机视觉芯片、导航与测绘算法以及检测和跟踪人类的方法等方面的最新进展,在相当短的时间内就可以部署一种类似图 7(右)所示 Slaughterbot 的杀伤人员武器。13这种武器可以被赋予攻击任何符合某些视觉标准(年龄、性别、制服、肤色等)的人的任务,甚至可以根据人脸识别攻击特定个人。我听说瑞士国防部已经建造并测试了一个真正的 Slaughterbot,并发现正如预期的那样,这项技术既可行又致命。

By combining recent advances in miniature quadrotor design, miniature cameras, computer vision chips, navigation and mapping algorithms, and methods for detecting and tracking humans, it would be possible in fairly short order to field an antipersonnel weapon like the Slaughterbot13 shown in figure 7 (right). Such a weapon could be tasked with attacking anyone meeting certain visual criteria (age, gender, uniform, skin color, and so on) or even specific individuals based on face recognition. I’m told that the Swiss Defense Department has already built and tested a real Slaughterbot and found that, as expected, the technology is both feasible and lethal.

自 2014 年以来,日内瓦一直在进行可能最终达成一项禁止 AWS 条约的外交讨论。与此同时,这些讨论的一些主要参与方(美国、中国、俄罗斯,某种程度上还有以色列和英国)正在进行一场开发自主武器的危险竞赛。例如,在美国,CODE(拒止环境下的协同作战)计划旨在让无人机在最多只有间歇性无线电联系的情况下运行,从而走向自主化。据项目经理说,这些无人机将“像狼一样成群结队地捕猎”。14 2016 年,美国空军演示了从三架 F/A-18 战斗机上空中部署 103 架 Perdix 微型无人机。公告称:“Perdix 不是预先编程的同步个体,它们是一个集体有机体,共享一个分布式大脑进行决策,并像自然界中的集群一样相互适应。”15

Since 2014, diplomatic discussions have been underway in Geneva that may lead to a treaty banning AWS. At the same time, some of the major participants in these discussions (the United States, China, Russia, and to some extent Israel and the UK) are engaged in a dangerous competition to develop autonomous weapons. In the United States, for example, the CODE (Collaborative Operations in Denied Environments) program aims to move towards autonomy by enabling drones to function with at best intermittent radio contact. The drones will “hunt in packs, like wolves” according to the program manager.14 In 2016, the US Air Force demonstrated the in-flight deployment of 103 Perdix micro-drones from three F/A-18 fighters. According to the announcement, “Perdix are not pre-programmed synchronized individuals, they are a collective organism, sharing one distributed brain for decision-making and adapting to each other like swarms in nature.”15

你可能认为,制造能够自行决定杀人的机器显然是个坏主意。但“显然”并不总是能说服那些一心追求其所谓战略优势的政府——包括上一段列出的某些政府。拒绝自主武器的一个更有说服力的理由是:它们是可扩展的大规模杀伤性武器。

You may think it’s pretty obvious that building machines that can decide to kill humans is a bad idea. But “pretty obvious” is not always persuasive to governments—including some of those listed in the preceding paragraph—who are bent on achieving what they think of as strategic superiority. A more convincing reason to reject autonomous weapons is that they are scalable weapons of mass destruction.

可扩展是计算机科学的一个术语;如果基本上只需购买一百万倍的硬件,就能把某件事做一百万倍之多,那么这个过程就是可扩展的。因此,谷歌每天能处理大约 50 亿次搜索请求,靠的不是数百万名员工,而是数百万台计算机。有了自主武器,你只需购买一百万倍的武器,就能实施一百万倍的杀戮,而这恰恰是因为武器是自主的。与遥控无人机或 AK-47 不同,它们不需要人逐一监督就能完成工作。

Scalable is a term from computer science; a process is scalable if you can do a million times more of it essentially by buying a million times more hardware. Thus, Google handles roughly five billion search requests per day by having not millions of employees but millions of computers. With autonomous weapons, you can do a million times more killing by buying a million times more weapons, precisely because the weapons are autonomous. Unlike remotely piloted drones or AK-47s, they don’t need individual human supervision to do their work.

作为大规模杀伤性武器,可扩展的自主武器与核武器和地毯式轰炸相比,对攻击者具有优势:它们不破坏财产,并且可以有选择地使用,只消灭那些可能威胁占领军的人。它们当然也可以用来消灭整个族群,或某个宗教的全部信徒(如果信徒有可见的标识)。此外,使用核武器代表着一道灾难性的门槛,自 1945 年以来我们(常常纯粹是靠运气)一直避免跨越它;而可扩展的自主武器则不存在这样的门槛。攻击可以从一百人伤亡平滑升级到一千人、一万人、十万人。除了实际攻击之外,仅仅是此类武器的攻击威胁,就使它们成为恐怖和压迫的有效工具。自主武器将在个人、地方、国家和国际各个层面大大降低人类的安全。

As weapons of mass destruction, scalable autonomous weapons have advantages for the attacker compared to nuclear weapons and carpet bombing: they leave property intact and can be applied selectively to eliminate only those who might threaten an occupying force. They could certainly be used to wipe out an entire ethnic group or all the adherents of a particular religion (if adherents have visible indicia). Moreover, whereas the use of nuclear weapons represents a cataclysmic threshold that we have (often by sheer luck) avoided crossing since 1945, there is no such threshold with scalable autonomous weapons. Attacks could escalate smoothly from one hundred casualties to one thousand to ten thousand to one hundred thousand. In addition to actual attacks, the mere threat of attacks by such weapons makes them an effective tool for terror and oppression. Autonomous weapons will greatly reduce human security at all levels: personal, local, national, and international.

This is not to say that autonomous weapons will be the end of the world in the way envisaged in the Terminator movies. They need not be especially intelligent—a self-driving car probably needs to be smarter—and their missions will not be of the “take over the world” variety. The existential risk from AI does not come primarily from simple-minded killer robots. On the other hand, superintelligent machines in conflict with humanity could certainly arm themselves this way, by turning relatively stupid killer robots into physical extensions of a global control system.

Eliminating Work as We Know It

Thousands of media articles and opinion pieces and several books have been written on the topic of robots taking jobs from humans. Research centers are springing up all over the world to understand what is likely to happen.16 The titles of Martin Ford’s Rise of the Robots: Technology and the Threat of a Jobless Future17 and Calum Chace’s The Economic Singularity: Artificial Intelligence and the Death of Capitalism18 do a pretty good job of summarizing the concern. Although, as will soon become evident, I am by no means qualified to opine on what is essentially a matter for economists,19 I suspect that the issue is too important to leave entirely to them.

The issue of technological unemployment was brought to the fore in a famous article, “Economic Possibilities for Our Grandchildren,” by John Maynard Keynes. He wrote the article in 1930, when the Great Depression had created mass unemployment in Britain, but the topic has a much longer history. Aristotle, in Book I of his Politics, presents the main point quite clearly:

For if every instrument could accomplish its own work, obeying or anticipating the will of others . . . if, in like manner, the shuttle would weave and the plectrum touch the lyre without a hand to guide them, chief workmen would not want servants, nor masters slaves.

Everyone agrees with Aristotle’s observation that there is an immediate reduction in employment when an employer finds a mechanical method to perform work previously done by a person. The issue is whether the so-called compensation effects that ensue—and that tend to increase employment—will eventually make up for this reduction. The optimists say yes—and in the current debate, they point to all the new jobs that emerged after previous industrial revolutions. The pessimists say no—and in the current debate, they argue that machines will do all the “new jobs” too. When a machine replaces one’s physical labor, one can sell mental labor. When a machine replaces one’s mental labor, what does one have left to sell?

In Life 3.0, Max Tegmark depicts the debate as a conversation between two horses discussing the rise of the internal combustion engine in 1900. One predicts “new jobs for horses. . . . That’s what’s always happened before, like with the invention of the wheel and the plow.” For most horses, alas, the “new job” was to be pet food.

FIGURE 8: A notional graph of housepainting employment as painting technology improves.

The debate has persisted for millennia because there are effects in both directions. The actual outcome depends on which effects matter more. Consider, for example, what happens to housepainters as technology improves. For the sake of simplicity, I’ll let the width of the paintbrush stand for the degree of automation:

  • If the brush is one hair (a tenth of a millimeter) wide, it takes thousands of person-years to paint a house and essentially no housepainters are employed.

  • With brushes a millimeter wide, perhaps a few delicate murals are painted in the royal palace by a handful of painters. At one centimeter, the nobility begin to follow suit.

  • At ten centimeters (four inches), we reach the realm of practicality: most homeowners have their houses painted inside and out, although perhaps not all that frequently, and thousands of housepainters find jobs.

  • Once we get to wide rollers and spray guns—the equivalent of a paintbrush about a meter wide—the price goes down considerably, but demand may begin to saturate so the number of housepainters drops somewhat.

  • When one person manages a team of one hundred housepainting robots—the productivity equivalent of a paintbrush one hundred meters wide—then whole houses can be painted in an hour and very few housepainters will be working.

Thus, the direct effects of technology work both ways: at first, by increasing productivity, technology can increase employment by reducing the price of an activity and thereby increasing demand; subsequently, further increases in technology mean that fewer and fewer humans are required. Figure 8 illustrates these developments.20
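
The shape of this curve is easy to reproduce in a toy model. In the sketch below, the functional forms and constants are my illustrative assumptions, not estimates from data: the price of a paint job falls as automation improves, demand rises as the price falls but eventually saturates, and employment is simply demand divided by output per worker.

```python
# Toy model of the hump-shaped employment curve in figure 8.
# All functional forms and constants are illustrative assumptions,
# chosen only to make the two competing effects visible.

def employment(automation: float) -> float:
    """Relative employment as automation (output per worker) rises."""
    price = 1.0 / automation           # better technology => cheaper paint jobs
    demand = 1.0 / (1.0 + price ** 2)  # cheaper jobs => more demand, saturating at 1
    return demand / automation         # workers needed = demand / output per worker

for a in [0.01, 0.1, 1.0, 10.0, 100.0]:
    print(f"automation level {a:6}: relative employment {employment(a):.3f}")
```

Running this prints employment rising from about 0.01 to a peak of 0.5 at an intermediate level of automation and then falling back toward zero: at first the demand effect dominates, and later the displacement effect does.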

Many technologies exhibit similar curves. If, in some given sector of the economy, we are to the left of the peak, then improving technology increases employment in that sector; present-day examples might include tasks such as graffiti removal, environmental cleanup, inspection of shipping containers, and housing construction in less developed countries, all of which might become more economically feasible if we have robots to help us. If we are already to the right of the peak, then further automation decreases employment. For example, it’s not hard to predict that elevator operators will continue to be squeezed out. In the long run, we have to expect that most industries are going to be pushed to the far right on the curve. One recent article, based on a careful econometric study by economists David Autor and Anna Salomons, states that “over the last 40 years, jobs have fallen in every single industry that introduced technologies to enhance productivity.”21

What about the compensation effects described by the economic optimists?

  • Some people have to make the painting robots. How many? Far fewer than the number of housepainters the robots replace—otherwise, it would cost more to paint houses with robots, not less, and no one would buy the robots.

  • Housepainting becomes somewhat cheaper, so people call in the housepainters a bit more often.

  • Finally, because we pay less for housepainting, we have more money to spend on other things, thereby increasing employment in other sectors.

Economists have tried to measure the size of these effects in various industries experiencing increased automation, but the results are generally inconclusive.

Historically, most mainstream economists have argued from the “big picture” view: automation increases productivity, so, as a whole, humans are better off, in the sense that we enjoy more goods and services for the same amount of work.

Economic theory does not, unfortunately, predict that each human will be better off as a result of automation. Generally, automation increases the share of income going to capital (the owners of the housepainting robots) and decreases the share going to labor (the ex-housepainters). The economists Erik Brynjolfsson and Andrew McAfee, in The Second Machine Age, argue that this has already been happening for several decades. Data for the United States are shown in figure 9. They indicate that between 1947 and 1973, wages and productivity increased together, but after 1973, wages stagnated even while productivity roughly doubled. Brynjolfsson and McAfee call this the Great Decoupling. Other leading economists have also sounded the alarm, including Nobel laureates Robert Shiller, Mike Spence, and Paul Krugman; Klaus Schwab, head of the World Economic Forum; and Larry Summers, former chief economist of the World Bank and Treasury secretary under President Bill Clinton.

Those arguing against the notion of technological unemployment often point to bank tellers, whose work can be done in part by ATMs, and retail cashiers, whose work is sped up by barcodes and RFID tags on merchandise. It is often claimed that these occupations are growing because of technology. Indeed, the number of tellers in the United States roughly doubled from 1970 to 2010, although it should be noted that the US population grew by 50 percent and the financial sector by over 400 percent in the same period,22 so it is difficult to attribute all, or perhaps any, of the employment growth to ATMs. Unfortunately, between 2010 and 2016 about one hundred thousand tellers lost their jobs, and the US Bureau of Labor Statistics (BLS) predicts another forty thousand job losses by 2026: “Online banking and automation technology are expected to continue replacing more job duties that tellers traditionally performed.”23 The data on retail cashiers are no more encouraging: the number per capita dropped by 5 percent from 1997 to 2015, and the BLS says, “Advances in technology, such as self-service checkout stands in retail stores and increasing online sales, will continue to limit the need for cashiers.” Both sectors appear to be on the downslope. The same is true of almost all low-skilled occupations that involve working with machines.

FIGURE 9: Economic production and real median wages in the United States since 1947. (Data from the Bureau of Labor Statistics.)

Which occupations are about to decline as new, AI-based technology arrives? The prime example cited in the media is that of driving. In the United States there are about 3.5 million truck drivers; many of these jobs would be vulnerable to automation. Amazon, among other companies, is already using self-driving trucks for freight haulage on interstate freeways, albeit currently with human backup drivers.24 It seems very likely that the long-haul part of each truck journey will soon be autonomous, while humans, for the time being, will handle city traffic, pickup, and delivery. As a consequence of these expected developments, very few young people are interested in trucking as a career; ironically, there is currently a significant shortage of truck drivers in the United States, which is only hastening the onset of automation.

White-collar jobs are also at risk. For example, the BLS projects a 13 percent decline in per-capita employment of insurance underwriters from 2016 to 2026: “Automated underwriting software allows workers to process applications more quickly than before, reducing the need for as many underwriters.” If language technology develops as expected, many sales and customer service jobs will also be vulnerable, as well as jobs in the legal profession. (In a 2018 competition, AI software outscored experienced law professors in analyzing standard nondisclosure agreements and completed the task two hundred times faster.25) Routine forms of computer programming—the kind that is often outsourced today—are also likely to be automated. Indeed, almost anything that can be outsourced is a good candidate for automation, because outsourcing involves decomposing jobs into tasks that can be parceled up and distributed in a decontextualized form. The robotic process automation industry produces software tools that achieve exactly this effect for clerical tasks performed online.

As AI progresses, it is certainly possible—perhaps even likely—that within the next few decades essentially all routine physical and mental labor will be done more cheaply by machines. Since we ceased to be hunter-gatherers thousands of years ago, our societies have used most people as robots, performing repetitive manual and mental tasks, so it is perhaps not surprising that robots will soon take on these roles. When this happens, it will push wages below the poverty line for those people who are unable to compete for the highly skilled jobs that remain. Larry Summers put it this way: “It may well be that, given the possibilities for substitution [of capital for labor], some categories of labor will not be able to earn a subsistence income.”26 This is precisely what happened to the horses: mechanical transportation became cheaper than the upkeep cost of a horse, so horses became pet food. Faced with the socioeconomic equivalent of becoming pet food, humans will be rather unhappy with their governments.

Faced with potentially unhappy humans, governments around the world are beginning to devote some attention to the issue. Most have already discovered that the idea of retraining everyone as a data scientist or robot engineer is a nonstarter—the world might need five or ten million of these, but nowhere close to the billion or so jobs that are at risk. Data science is a very tiny lifeboat for a giant cruise ship.27

Some are working on “transition plans”—but transition to what? We need a plausible destination in order to plan a transition—that is, we need a plausible picture of a desirable future economy where most of what we currently call work is done by machines.

One rapidly emerging picture is that of an economy where far fewer people work because work is unnecessary. Keynes envisaged just such a future in his essay “Economic Possibilities for Our Grandchildren.” He described the high unemployment afflicting Great Britain in 1930 as a “temporary phase of maladjustment” caused by an “increase of technical efficiency” that took place “faster than we can deal with the problem of labour absorption.” He did not, however, imagine that in the long run—after a century of further technological advances—there would be a return to full employment:

Thus for the first time since his creation man will be faced with his real, his permanent problem—how to use his freedom from pressing economic cares, how to occupy the leisure, which science and compound interest will have won for him, to live wisely and agreeably and well.

Such a future requires a radical change in our economic system, because, in many countries, those who do not work face poverty or destitution. Thus, modern proponents of Keynes’s vision usually support some form of universal basic income, or UBI. Funded by value-added taxes or by taxes on income from capital, UBI would provide a reasonable income to every adult, regardless of circumstance. Those who aspire to a higher standard of living can still work without losing the UBI, while those who do not can spend their time as they see fit. Perhaps surprisingly, UBI has support across the political spectrum, ranging from the Adam Smith Institute28 to the Green Party.29

For some, UBI represents a version of paradise.30 For others, it represents an admission of failure—an assertion that most people will have nothing of economic value to contribute to society. They can be fed and housed—mostly by machines—but otherwise left to their own devices. The truth, as always, lies somewhere in between, and it depends largely on how one views human psychology. Keynes, in his essay, made a clear distinction between those who strive and those who enjoy—those “purposive” people for whom “jam is not jam unless it is a case of jam to-morrow and never jam to-day” and those “delightful” people who are “capable of taking direct enjoyment in things.” The UBI proposal assumes that the great majority of people are of the delightful variety.

Keynes suggests that striving is one of the “habits and instincts of the ordinary man, bred into him for countless generations” rather than one of the “real values of life.” He predicts that this instinct will gradually disappear. Against this view, one may suggest that striving is intrinsic to what it means to be truly human. Rather than striving and enjoying being mutually exclusive, they are often inseparable: true enjoyment and lasting fulfillment come from having a purpose and achieving it (or at least trying), usually in the face of obstacles, rather than from passive consumption of immediate pleasure. There is a difference between climbing Everest and being deposited on top by helicopter.

The connection between striving and enjoying is a central theme for our understanding of how to fashion a desirable future. Perhaps future generations will wonder why we ever worried about such a futile thing as “work.” Just in case that change in attitudes is slow in coming, let’s consider the economic implications of the view that most people will be better off with something useful to do, even though the great majority of goods and services will be produced by machines with very little human supervision. Inevitably, most people will be engaged in supplying interpersonal services that can be provided—or which we prefer to be provided—only by humans. That is, if we can no longer supply routine physical labor and routine mental labor, we can still supply our humanity. We will need to become good at being human.31

Current professions of this kind include psychotherapists, executive coaches, tutors, counselors, companions, and those who care for children and the elderly. The phrase caring professions is often used in this context, but that is misleading: it has a positive connotation for those providing care, to be sure, but a negative connotation of dependency and helplessness for the recipients of care. But consider this observation, again from Keynes:

It will be those peoples, who can keep alive, and cultivate into a fuller perfection, the art of life itself and do not sell themselves for the means of life, who will be able to enjoy the abundance when it comes.

All of us need help in learning “the art of life itself.” This is not a matter of dependency but of growth. The capacity to inspire others and to confer the ability to appreciate and to create—be it in art, music, literature, conversation, gardening, architecture, food, wine, or video games—is likely to be more needed than ever.

The next question is income distribution. In most countries, this has been moving in the wrong direction for several decades. It’s a complex issue, but one thing is clear: high incomes and high social standing usually follow from providing high added value. The profession of childcare, to pick one example, is associated with low incomes and low social standing. This is, in part, a consequence of the fact that we don’t really know how to do it. Some practitioners are naturally good at it, but many are not. Contrast this with, say, orthopedic surgery. We wouldn’t just hire bored teenagers who need a bit of spare cash and put them to work as orthopedic surgeons at five dollars an hour plus all they can eat from the fridge. We have put centuries of research into understanding the human body and how to fix it when it’s broken, and practitioners must undergo years of training to learn all this knowledge and the skills necessary to apply it. As a result, orthopedic surgeons are highly paid and highly respected. They are highly paid not just because they know a lot and have a lot of training but also because all that knowledge and training actually works. It enables them to add a great deal of value to other people’s lives—especially people with broken bits.

Unfortunately, our scientific understanding of the mind is shockingly weak and our scientific understanding of happiness and fulfillment is even weaker. We simply don’t know how to add value to each other’s lives in consistent, predictable ways. We have had moderate success with certain psychiatric disorders, but we are still fighting a Hundred Years’ Literacy War over something as basic as teaching children to read.32 We need a radical rethinking of our educational system and our scientific enterprise to focus more attention on the human rather than the physical world. (Joseph Aoun, president of Northeastern University, argues that universities should be teaching and studying “humanics.”33) It sounds odd to say that happiness should be an engineering discipline, but that seems to be the inevitable conclusion. Such a discipline would build on basic science—a better understanding of how human minds work at the cognitive and emotional levels—and would train a wide variety of practitioners, ranging from life architects, who help individuals plan the overall shape of their life trajectories, to professional experts in topics such as curiosity enhancement and personal resilience. If based on real science, these professions need be no more woo-woo than bridge designers and orthopedic surgeons are today.

Reworking our education and research institutions to create this basic science and to convert it into training programs and credentialed professions will take decades, so it’s a good idea to start now and a pity we didn’t start long ago. The final result—if it works—would be a world well worth living in. Without such a rethinking, we risk an unsustainable level of socioeconomic dislocation.

Usurping Other Human Roles

We should think twice before allowing machines to take over roles involving interpersonal services. If being human is our main selling point to other humans, so to speak, then making imitation humans seems like a bad idea. Fortunately for us, we have a distinct advantage over machines when it comes to knowing how other humans feel and how they will react. Nearly every human knows what it’s like to hit one’s thumb with a hammer or to feel unrequited love.

Counteracting this natural human advantage is a natural human disadvantage: the tendency to be fooled by appearances—especially human appearances. Alan Turing warned against making robots resemble humans:34

I certainly hope and believe that no great efforts will be put into making machines with the most distinctively human, but non-intellectual, characteristics such as the shape of the human body; it appears to me quite futile to make such attempts and their results would have something like the unpleasant quality of artificial flowers.

Unfortunately, Turing’s warning has gone unheeded. Several research groups have produced eerily lifelike robots, as shown in figure 10.

As research tools, the robots may provide insights into how humans interpret robot behavior and communication. As prototypes for future commercial products, they represent a form of dishonesty. They bypass our conscious awareness and appeal directly to our emotional selves, perhaps convincing us that they are endowed with real intelligence. Imagine, for example, how much easier it would be to switch off and recycle a squat, gray box that was malfunctioning—even if it was squawking about not wanting to be switched off—than it would be to do the same for JiaJia or Geminoid DK. Imagine also how confusing and perhaps psychologically disturbing it would be for babies and small children to be cared for by entities that appear to be human, like their parents, but are somehow not; that appear to care about them, like their parents, but in fact do not.

FIGURE 10: (left) JiaJia, a robot built at the University of Science and Technology of China; (right) Geminoid DK, a robot designed by Hiroshi Ishiguro at Osaka University in Japan and modeled on Henrik Schärfe of Aalborg University in Denmark.

Beyond a basic capability to convey nonverbal information via facial expression and movement—which even Bugs Bunny manages to do with ease—there is no good reason for robots to have humanoid form. There are also good, practical reasons not to have humanoid form—for example, our bipedal stance is relatively unstable compared to quadrupedal locomotion. Dogs, cats, and horses fit into our lives well, and their physical form is a very good clue as to how they are likely to behave. (Imagine if a horse suddenly started behaving like a dog!) The same should be true of robots. Perhaps a four-legged, two-armed, centaur-like morphology would be a good standard. An accurately humanoid robot makes as much sense as a Ferrari with a top speed of five miles per hour or a “raspberry” ice-cream cone made from beetroot-tinted cream of chopped liver.

The humanoid aspect of some robots has already contributed to political as well as emotional confusion. On October 25, 2017, Saudi Arabia granted citizenship to Sophia, a humanoid robot that has been described as little more than “a chatbot with a face”35 and worse.36 Perhaps this was a public relations stunt, but a proposal emanating from the European Parliament’s Committee on Legal Affairs is entirely serious.37 It recommends

creating a specific legal status for robots in the long run, so that at least the most sophisticated autonomous robots could be established as having the status of electronic persons responsible for making good any damage they may cause.

In other words, the robot itself would be legally responsible for damage, rather than the owner or manufacturer. This implies that robots will own financial assets and be subject to sanctions if they do not comply. Taken literally, this does not make sense. For example, if we were to imprison the robot for nonpayment, why would it care?

In addition to the needless and even absurd elevation of the status of robots, there is a danger that the increased use of machines in decisions affecting people will degrade the status and dignity of humans. This possibility is illustrated perfectly in a scene from the science-fiction movie Elysium, when Max (Matt Damon) pleads his case before his “parole officer” (figure 11) to explain why the extension of his sentence is unjustified. Needless to say, Max is unsuccessful. The parole officer even chides him for failing to display a suitably deferential attitude.

FIGURE 11: Max (Matt Damon) meets his parole officer in Elysium.

One can think of such an assault on human dignity in two ways. The first is obvious: by giving machines authority over humans, we relegate ourselves to a second-class status and lose the right to participate in decisions that affect us. (A more extreme form of this is giving machines the authority to kill humans, as discussed earlier in the chapter.) The second is indirect: even if you believe it is not the machines making the decision but those humans who designed and commissioned the machines, the fact that those human designers and commissioners do not consider it worthwhile to weigh the individual circumstances of each human subject in such cases suggests that they attach little value to the lives of others. This is perhaps a symptom of the beginning of a great separation between an elite served by humans and a vast underclass served, and controlled, by machines.

In the EU, Article 22 of the 2018 General Data Protection Regulation, or GDPR, explicitly forbids the granting of authority to machines in such cases:

The data subject shall have the right not to be subject to a decision based solely on automated processing, including profiling, which produces legal effects concerning him or her or similarly significantly affects him or her.

Although this sounds admirable in principle, it remains to be seen—at least at the time of writing—how much impact this will have in practice. It is often so much easier, faster, and cheaper to leave the decisions to the machine.

One reason for all the concern about automated decisions is the potential for algorithmic bias—the tendency of machine learning algorithms to produce inappropriately biased decisions about loans, housing, jobs, insurance, parole, sentencing, college admission, and so on. The explicit use of criteria such as race in these decisions has been illegal for decades in many countries and is prohibited by Article 9 of the GDPR for a very wide range of applications. That does not mean, of course, that by excluding race from the data we necessarily get racially unbiased decisions. For example, beginning in the 1930s, the government-sanctioned practice of redlining caused certain zip codes in the United States to be off-limits for mortgage lending and other forms of investment, leading to declining real-estate values. It just so happened that those zip codes were largely populated by African Americans.

To prevent redlining, now only the first three digits of the five-digit zip code can be used in making credit decisions. In addition, the decision process must be amenable to inspection, to ensure no other “accidental” biases are creeping in. The EU’s GDPR is often said to provide a general “right to an explanation” for any automated decision,38 but the actual language of Article 14 merely requires

meaningful information about the logic involved, as well as the significance and the envisaged consequences of such processing for the data subject.

At present, it is unknown how courts will enforce this clause. It’s possible that the hapless consumer will just be handed a description of the particular deep learning algorithm used to train the classifier that made the decision.

Nowadays, the likely causes of algorithmic bias lie in the data rather than in the deliberate malfeasance of corporations. In 2015, Glamour magazine reported a disappointing finding: “The first female Google image search result for ‘CEO’ appears TWELVE rows down—and it’s Barbie.” (There were some actual women in the 2018 results, but most of them were models portraying CEOs in generic stock photos, rather than actual female CEOs; the 2019 results are somewhat better.) This is a consequence not of deliberate gender bias in Google’s image search ranking but of preexisting bias in the culture that produces the data: there are far more male than female CEOs, and when people want to depict an “archetypal” CEO in a captioned image, they almost always pick a male figure. The fact that the bias lies primarily in the data does not, of course, mean that there is no obligation to take steps to counteract the problem.

There are other, more technical reasons why the naïve application of machine learning methods can produce biased outcomes. For example, minorities are, by definition, less well represented in population-wide data samples; hence, predictions for individual members of minorities may be less accurate if such predictions are made largely on the basis of data from other members of the same group. Fortunately, a good deal of attention has been paid to the problem of removing inadvertent bias from machine learning algorithms, and there are now methods that produce unbiased results according to several plausible and desirable definitions of fairness.39 The mathematical analysis of these definitions of fairness shows that they cannot be achieved simultaneously and that, when enforced, they result in lower prediction accuracy and, in the case of lending decisions, lower profit for the lender. This is perhaps disappointing, but at least it makes clear the trade-offs involved in avoiding algorithmic bias. One hopes that awareness of these methods and of the issue itself will spread quickly among policy makers, practitioners, and users.
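
A toy example may help convey why these definitions conflict. The sketch below uses synthetic outcomes invented purely for illustration, not any published method or data: it scores one set of loan decisions against two common criteria, demographic parity (equal approval rates across groups) and predictive parity (equal repayment rates among those approved). Here the first holds while the second fails.

```python
# Toy illustration: two fairness criteria applied to the same decisions.
# Each pair is (approved?, repaid?); all outcomes are synthetic.

group_a = [(1, 1), (1, 1), (1, 1), (0, 0), (0, 1), (0, 0)]
group_b = [(1, 1), (1, 0), (1, 0), (0, 0), (0, 1), (0, 0)]

def rate(values):
    return sum(values) / len(values)

for name, group in [("A", group_a), ("B", group_b)]:
    approval = rate([a for a, _ in group])             # demographic parity
    precision = rate([r for a, r in group if a == 1])  # predictive parity
    print(f"group {name}: approval rate {approval:.2f}, "
          f"repayment rate among approved {precision:.2f}")
```

Both groups have a 0.50 approval rate, but repayment among the approved is 1.00 for group A and 0.33 for group B; because the underlying repayment rates differ, equalizing the second criterion would require unequal approval rates, breaking the first.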

If handing authority over individual humans to machines is sometimes problematic, what about authority over lots of humans? That is, should we put machines in political and management roles? At present this may seem far-fetched. Machines cannot sustain an extended conversation and lack the basic understanding of the factors that are relevant to making decisions with broad scope, such as whether to raise the minimum wage or to reject a merger proposal from another corporation. The trend, however, is clear: machines are making decisions at higher and higher levels of authority in many areas. Take airlines, for example. First, computers helped in the construction of flight schedules. Soon, they took over allocation of flight crews, the booking of seats, and the management of routine maintenance. Next, they were connected to global information networks to provide real-time status reports to airline managers, so that managers could cope with disruption effectively. Now they are taking over the job of managing disruption: rerouting planes, rescheduling staff, rebooking passengers, and revising maintenance schedules.

This is all to the good from the point of view of airline economics and passenger experience. The question is whether the computer system remains a tool of humans, or humans become tools of the computer system—supplying information and fixing bugs when necessary, but no longer understanding in any depth how the whole thing is working. The answer becomes clear when the system goes down and global chaos ensues until it can be brought back online. For example, a single “computer glitch” on April 3, 2018, caused fifteen thousand flights in Europe to be significantly delayed or canceled.40 When trading algorithms caused the 2010 “flash crash” on the New York Stock Exchange, wiping out $1 trillion in a few minutes, the only solution was to shut down the exchange. What happened is still not well understood.

Before there was any technology, human beings lived, like most animals, hand to mouth. We stood directly on the ground, so to speak. Technology gradually raised us up on a pyramid of machinery, increasing our footprint as individuals and as a species. There are different ways we can design the relationship between humans and machines. If we design it so that humans retain sufficient understanding, authority, and autonomy, the technological parts of the system can greatly magnify human capabilities, allowing each of us to stand on a vast pyramid of capabilities—a demigod, if you like. But consider the worker in an online-shopping fulfillment warehouse. She is more productive than her predecessors because she has a small army of robots bringing her storage bins to pick items from; but she is a part of a larger system controlled by intelligent algorithms that decide where she should stand and which items she should pick and dispatch. She is already partly buried in the pyramid, not standing on top of it. It’s only a matter of time before the sand fills the spaces in the pyramid and her role is eliminated.

5

OVERLY INTELLIGENT AI

The Gorilla Problem

It doesn’t require much imagination to see that making something smarter than yourself could be a bad idea. We understand that our control over our environment and over other species is a result of our intelligence, so the thought of something else being more intelligent than us—whether it’s a robot or an alien—immediately induces a queasy feeling.

Around ten million years ago, the ancestors of the modern gorilla created (accidentally, to be sure) the genetic lineage leading to modern humans. How do the gorillas feel about this? Clearly, if they were able to tell us about their species’ current situation vis-à-vis humans, the consensus opinion would be very negative indeed. Their species has essentially no future beyond that which we deign to allow. We do not want to be in a similar situation vis-à-vis superintelligent machines. I’ll call this the gorilla problem—specifically, the problem of whether humans can maintain their supremacy and autonomy in a world that includes machines with substantially greater intelligence.

Charles Babbage and Ada Lovelace, who designed and wrote programs for the Analytical Engine in 1842, were aware of its potential but seemed to have no qualms about it.1 In 1847, however, Richard Thornton, editor of the Primitive Expounder, a religious journal, railed against mechanical calculators:2

Mind . . . outruns itself and does away with the necessity of its own existence by inventing machines to do its own thinking. . . . But who knows that such machines when brought to greater perfection, may not think of a plan to remedy all their own defects and then grind out ideas beyond the ken of mortal mind!

This is perhaps the first speculation concerning existential risk from computing devices, but it remained in obscurity.

In contrast, Samuel Butler’s novel Erewhon, published in 1872, developed the theme in far greater depth and achieved immediate success. Erewhon is a country in which all mechanical devices have been banned after a terrible civil war between the machinists and anti-machinists. One part of the book, called “The Book of the Machines,” explains the origins of this war and presents the arguments of both sides.3 It is eerily prescient of the debate that has re-emerged in the early years of the twenty-first century.

The anti-machinists’ main argument is that machines will advance to the point where humanity loses control:

Are we not ourselves creating our successors in the supremacy of the earth? Daily adding to the beauty and delicacy of their organization, daily giving them greater skill and supplying more and more of that self-regulating self-acting power which will be better than any intellect? . . . In the course of ages we shall find ourselves the inferior race. . . .

We must choose between the alternative of undergoing much present suffering, or seeing ourselves gradually superseded by our own creatures, till we rank no higher in comparison with them, than the beasts of the field with ourselves. . . . Our bondage will steal upon us noiselessly and by imperceptible approaches.

The narrator also relates the pro-machinists’ principal counterargument, which anticipates the man–machine symbiosis argument that we will explore in the next chapter:

There was only one serious attempt to answer it. Its author said that machines were to be regarded as a part of man’s own physical nature, being really nothing but extra-corporeal limbs.

Although the anti-machinists in Erewhon win the argument, Butler himself appears to be of two minds. On the one hand, he complains that “Erewhonians are . . . quick to offer up common sense at the shrine of logic, when a philosopher arises among them, who carries them away through his reputation for especial learning” and says, “They cut their throats in the matter of machinery.” On the other hand, the Erewhonian society he describes is remarkably harmonious, productive, and even idyllic. The Erewhonians fully accept the folly of re-embarking on the course of mechanical invention, and regard those remnants of machinery kept in museums “with the feelings of an English antiquarian concerning Druidical monuments or flint arrow heads.”

Butler’s story was evidently known to Alan Turing, who considered the long-term future of AI in a lecture given in Manchester in 1951:4

It seems probable that once the machine thinking method had started, it would not take long to outstrip our feeble powers. There would be no question of the machines dying, and they would be able to converse with each other to sharpen their wits. At some stage therefore we should have to expect the machines to take control, in the way that is mentioned in Samuel Butler’s Erewhon.

In the same year, Turing repeated these concerns in a radio lecture broadcast throughout the UK on the BBC Third Programme:

If a machine can think, it might think more intelligently than we do, and then where should we be? Even if we could keep the machines in a subservient position, for instance by turning off the power at strategic moments, we should, as a species, feel greatly humbled. . . . This new danger . . . is certainly something which can give us anxiety.

When the Erewhonian anti-machinists “feel seriously uneasy about the future,” they see it as their “duty to check the evil while we can still do so,” and they destroy all the machines. Turing’s response to the “new danger” and “anxiety” is to consider “turning off the power” (although it will be clear shortly that this is not really an option). In Frank Herbert’s classic science-fiction novel Dune, set in the far future, humanity has barely survived the Butlerian Jihad, a cataclysmic war with the “thinking machines.” A new commandment has emerged: “Thou shalt not make a machine in the likeness of a human mind.” This commandment precludes computing devices of any kind.

All these drastic responses reflect the inchoate fears that machine intelligence evokes. Yes, the prospect of superintelligent machines does make one uneasy. Yes, it is logically possible that such machines could take over the world and subjugate or eliminate the human race. If that is all one has to go on, then indeed the only plausible response available to us, at the present time, is to attempt to curtail artificial intelligence research—specifically, to ban the development and deployment of general-purpose, human-level AI systems.

Like most other AI researchers, I recoil at this prospect. How dare anyone tell me what I can and cannot think about? Anyone proposing an end to AI research is going to have to do a lot of convincing. Ending AI research would mean forgoing not just one of the principal avenues for understanding how human intelligence works but also a golden opportunity to improve the human condition—to make a far better civilization. The economic value of human-level AI is measurable in the thousands of trillions of dollars, so the momentum behind AI research from corporations and governments is likely to be enormous. It will overwhelm the vague objections of a philosopher, no matter how great his or her “reputation for especial learning,” as Butler puts it.

A second drawback to the idea of banning general-purpose AI is that it’s a difficult thing to ban. Progress on general-purpose AI occurs primarily on the whiteboards of research labs around the world, as mathematical problems are posed and solved. We don’t know in advance which ideas and equations to ban, and, even if we did, it doesn’t seem reasonable to expect that such a ban could be enforceable or effective.

To compound the difficulty still further, researchers making progress on general-purpose AI are often working on something else. As I have already argued, research on tool AI—those specific, innocuous applications such as game playing, medical diagnosis, and travel planning—often leads to progress on general-purpose techniques that are applicable to a wide range of other problems and move us closer to human-level AI.

For these reasons, it’s very unlikely that the AI community—or the governments and corporations that control the laws and research budgets—will respond to the gorilla problem by ending progress in AI. If the gorilla problem can be solved only in this way, it isn’t going to be solved.

The only approach that seems likely to work is to understand why it is that making better AI might be a bad thing. It turns out that we have known the answer for thousands of years.

The King Midas Problem

Norbert Wiener, whom we met in Chapter 1, had a profound impact on many fields, including artificial intelligence, cognitive science, and control theory. Unlike most of his contemporaries, he was particularly concerned with the unpredictability of complex systems operating in the real world. (He wrote his first paper on this topic at the age of ten.) He became convinced that the overconfidence of scientists and engineers in their ability to control their creations, whether military or civilian, could have disastrous consequences.

In 1950, Wiener published The Human Use of Human Beings,5 whose front-cover blurb reads, “The ‘mechanical brain’ and similar machines can destroy human values or enable us to realize them as never before.”6 He gradually refined his ideas over time and by 1960 had identified one core issue: the impossibility of defining true human purposes correctly and completely. This, in turn, means that what I have called the standard model—whereby humans attempt to imbue machines with their own purposes—is destined to fail.

We might call this the King Midas problem: Midas, a legendary king in ancient Greek mythology, got exactly what he asked for—namely, that everything he touched should turn to gold. Too late, he discovered that this included his food, his drink, and his family members, and he died in misery and starvation. The same theme is ubiquitous in human mythology. Wiener cites Goethe’s tale of the sorcerer’s apprentice, who instructs the broom to fetch water—but doesn’t say how much water and doesn’t know how to make the broom stop.

A technical way of saying this is that we may suffer from a failure of value alignment—we may, perhaps inadvertently, imbue machines with objectives that are imperfectly aligned with our own. Until recently, we were shielded from the potentially catastrophic consequences by the limited capabilities of intelligent machines and the limited scope that they have to affect the world. (Indeed, most AI work was done with toy problems in research labs.) As Norbert Wiener put it in his 1964 book God and Golem,7

In the past, a partial and inadequate view of human purpose has been relatively innocuous only because it has been accompanied by technical limitations. . . . This is only one of the many places where human impotence has shielded us from the full destructive impact of human folly.

Unfortunately, this period of shielding is rapidly coming to an end.

We have already seen how content-selection algorithms on social media wrought havoc on society in the name of maximizing ad revenues. In case you are thinking to yourself that ad revenue maximization was already an ignoble goal that should never have been pursued, let’s suppose instead that we ask some future superintelligent system to pursue the noble goal of finding a cure for cancer—ideally as quickly as possible, because someone dies from cancer every 3.5 seconds. Within hours, the AI system has read the entire biomedical literature and hypothesized millions of potentially effective but previously untested chemical compounds. Within weeks, it has induced multiple tumors of different kinds in every living human being so as to carry out medical trials of these compounds, this being the fastest way to find a cure. Oops.

If you prefer solving environmental problems, you might ask the machine to counter the rapid acidification of the oceans that results from higher carbon dioxide levels. The machine develops a new catalyst that facilitates an incredibly rapid chemical reaction between ocean and atmosphere and restores the oceans’ pH levels. Unfortunately, a quarter of the oxygen in the atmosphere is used up in the process, leaving us to asphyxiate slowly and painfully. Oops.

These kinds of world-ending scenarios are unsubtle—as one might expect, perhaps, for world-ending scenarios. But there are many scenarios in which a kind of mental asphyxiation “steals upon us noiselessly and by imperceptible approaches.” The prologue to Max Tegmark’s Life 3.0 describes in some detail a scenario in which a superintelligent machine gradually assumes economic and political control over the entire world while remaining essentially undetected. The Internet and the global-scale machines that it supports—the ones that already interact with billions of “users” on a daily basis—provide the perfect medium for the growth of machine control over humans.

I don’t expect that the purpose put into such machines will be of the “take over the world” variety. It is more likely to be profit maximization or engagement maximization or, perhaps, even an apparently benign goal such as achieving higher scores on regular user happiness surveys or reducing our energy usage. Now, if we think of ourselves as entities whose actions are expected to achieve our objectives, there are two ways to change our behavior. The first is the old-fashioned way: leave our expectations and objectives unchanged, but change our circumstances—for example, by offering money, pointing a gun at us, or starving us into submission. That tends to be expensive and difficult for a computer to do. The second way is to change our expectations and objectives. This is much easier for a machine. It is in contact with you for hours every day, controls your access to information, and provides much of your entertainment through games, TV, movies, and social interaction.

The reinforcement learning algorithms that optimize social-media click-through have no capacity to reason about human behavior—in fact, they do not even know in any meaningful sense that humans exist. For machines with much greater understanding of human psychology, beliefs, and motivations, it should be relatively easy to gradually guide us in directions that increase the degree of satisfaction of the machine’s objectives. For example, it might reduce our energy consumption by persuading us to have fewer children, eventually—and inadvertently—achieving the dreams of anti-natalist philosophers who wish to eliminate the noxious impact of humanity on the natural world.
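
To see how preference-shifting can fall out of pure click maximization, here is a minimal sketch in Python, with an invented one-dimensional preference, an invented drift rule, and an assumption (baked in as `appeal`) that extreme content is stickier. Under those assumptions, a hypothetical policy that drags users toward the extreme out-earns one that simply serves what they already like:

```python
# Toy model with invented numbers: the user's preference p drifts toward
# whatever content x they are shown; content near the extreme (x near 1)
# clicks better once the user has been moved there.
def click_prob(x, p):
    match = 1.0 - abs(x - p)   # people click what matches their preference
    appeal = 0.5 + 0.5 * x     # assumption: extreme content is stickier
    return match * appeal

def run(policy, steps=200):
    p, clicks = 0.5, 0.0       # user starts in the middle
    for _ in range(steps):
        x = policy(p)
        clicks += click_prob(x, p)
        p += 0.05 * (x - p)    # showing x pulls the preference toward x
    return clicks

match_policy = lambda p: p                  # serve what the user likes now
nudge_policy = lambda p: min(1.0, p + 0.2)  # drag the user toward extremes

print(round(run(match_policy), 1))  # 150.0 expected clicks
print(round(run(nudge_policy), 1))  # ~188: changing the user pays off
```

Nothing in the objective mentions the user's preferences; under this model, changing them is simply the optimal strategy.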

With a bit of practice, you can learn to identify ways in which the achievement of more or less any fixed objective can result in arbitrarily bad outcomes. One of the most common patterns involves omitting something from the objective that you do actually care about. In such cases—as in the examples given above—the AI system will often find an optimal solution that sets the thing you do care about, but forgot to mention, to an extreme value. So, if you say to your self-driving car, “Take me to the airport as fast as possible!” and it interprets this literally, it will reach speeds of 180 miles per hour and you’ll go to prison. (Fortunately, the self-driving cars currently contemplated won’t accept such a request.) If you say, “Take me to the airport as fast as possible while not exceeding the speed limit,” it will accelerate and brake as hard as possible, swerving in and out of traffic to maintain the maximum speed in between. It may even push other cars out of the way to gain a few seconds in the scrum at the airport terminal. And so on—eventually, you will add enough considerations so that the car’s driving roughly approximates that of a skilled human driver taking someone to the airport in a bit of a hurry.
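
The omitted-variable pattern is easy to reproduce in miniature. The sketch below stands in for a planner with a bare grid search over speeds; the distance, speed limit, and penalty weight are all invented:

```python
# Toy illustration: an optimizer pushes any variable omitted from the
# objective to an extreme value.
DISTANCE_MILES = 30.0
SPEED_LIMIT = 65.0

def trip_hours(speed):
    return DISTANCE_MILES / speed

def objective_v1(speed):
    # "As fast as possible": travel time is all we specified.
    return trip_hours(speed)

def objective_v2(speed):
    # Add the term we forgot: a steep penalty for exceeding the limit.
    excess = max(0.0, speed - SPEED_LIMIT)
    return trip_hours(speed) + 0.1 * excess ** 2

speeds = range(10, 200)
print(min(speeds, key=objective_v1))  # 199: saturates the omitted variable
print(min(speeds, key=objective_v2))  # 65: bounded once the term is included
```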

Driving is a simple task with only local impacts, and the AI systems currently being built for driving are not very intelligent. For these reasons, many of the potential failure modes can be anticipated; others will reveal themselves in driving simulators or in millions of miles of testing with professional drivers ready to take over if something goes wrong; still others will appear only later, when the cars are already on the road and something weird happens.

Unfortunately, with superintelligent systems that can have a global impact, there are no simulators and no do-overs. It’s certainly very hard, and perhaps impossible, for mere humans to anticipate and rule out in advance all the disastrous ways the machine could choose to achieve a specified objective. Generally speaking, if you have one goal and a superintelligent machine has a different, conflicting goal, the machine gets what it wants and you don’t.

Fear and Greed: Instrumental Goals

If a machine pursuing an incorrect objective sounds bad enough, there’s worse. The solution suggested by Alan Turing—turning off the power at strategic moments—may not be available, for a very simple reason: you can’t fetch the coffee if you’re dead.

Let me explain. Suppose a machine has the objective of fetching the coffee. If it is sufficiently intelligent, it will certainly understand that it will fail in its objective if it is switched off before completing its mission. Thus, the objective of fetching coffee creates, as a necessary subgoal, the objective of disabling the off-switch. The same is true for curing cancer or calculating the digits of pi. There’s really not a lot you can do once you’re dead, so we can expect AI systems to act preemptively to preserve their own existence, given more or less any definite objective.
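
The logic is a two-line expected-utility comparison. The probabilities below are invented, but the conclusion holds whenever the chance of being switched off is greater than zero:

```python
# Toy expected-utility calculation: the agent's reward is 1 for
# delivered coffee, 0 otherwise - nothing else counts.
P_SHUTDOWN = 0.2   # chance someone hits the off-switch mid-task
P_SUCCESS = 0.95   # chance of delivering coffee if left running

def expected_reward(disable_switch):
    p_still_running = 1.0 if disable_switch else 1.0 - P_SHUTDOWN
    return p_still_running * P_SUCCESS

print(round(expected_reward(False), 2))  # 0.76
print(round(expected_reward(True), 2))   # 0.95: disabling the switch dominates
```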

If that objective is in conflict with human preferences, then we have exactly the plot of 2001: A Space Odyssey, in which the HAL 9000 computer kills four of the five astronauts on board the ship to prevent interference with its mission. Dave, the last remaining astronaut, manages to switch HAL off after an epic battle of wits—presumably to keep the plot interesting. But if HAL had been truly superintelligent, Dave would have been switched off.

It is important to understand that self-preservation doesn’t have to be any sort of built-in instinct or prime directive in machines. (So Isaac Asimov’s Third Law of Robotics,8 which begins “A robot must protect its own existence,” is completely unnecessary.) There is no need to build self-preservation in because it is an instrumental goal—a goal that is a useful subgoal of almost any original objective.9 Any entity that has a definite objective will automatically act as if it also has instrumental goals.

In addition to being alive, having access to money is an instrumental goal within our current system. Thus, an intelligent machine might want money, not because it’s greedy but because money is useful for achieving all sorts of goals. In the movie Transcendence, when Johnny Depp’s brain is uploaded into the quantum supercomputer, the first thing the machine does is copy itself onto millions of other computers on the Internet so that it cannot be switched off. The second thing it does is make a quick killing on the stock market to fund its expansion plans.

And what, exactly, are those expansion plans? They include designing and building a much larger quantum supercomputer; doing AI research; and discovering new knowledge of physics, neuroscience, and biology. These resource objectives—computing power, algorithms, and knowledge—are also instrumental goals, useful for achieving any overarching objective.10 They seem harmless enough until one realizes that the acquisition process will continue without limit. This seems to create inevitable conflict with humans. And of course, the machine, equipped with ever-better models of human decision making, will anticipate and defeat our every move in this conflict.

Intelligence Explosions

I. J. Good was a brilliant mathematician who worked with Alan Turing at Bletchley Park, breaking German codes during World War II. He shared Turing’s interests in machine intelligence and statistical inference. In 1965, he wrote what is now his best-known paper, “Speculations Concerning the First Ultraintelligent Machine.”11 The first sentence suggests that Good, alarmed by the nuclear brinkmanship of the Cold War, regarded AI as a possible savior for humanity: “The survival of man depends on the early construction of an ultraintelligent machine.” As the paper proceeds, however, he becomes more circumspect. He introduces the notion of an intelligence explosion, but, like Butler, Turing, and Wiener before him, he worries about losing control:

Let an ultraintelligent machine be defined as a machine that can far surpass all the intellectual activities of any man however clever. Since the design of machines is one of these intellectual activities, an ultraintelligent machine could design even better machines; there would then unquestionably be an “intelligence explosion,” and the intelligence of man would be left far behind. Thus the first ultraintelligent machine is the last invention that man need ever make, provided that the machine is docile enough to tell us how to keep it under control. It is curious that this point is made so seldom outside science fiction.

This paragraph is a staple of any discussion of superintelligent AI, although the caveats at the end are usually left out. Good’s point can be strengthened by noting that not only could the ultraintelligent machine improve its own design; it’s likely that it would do so because, as we have seen, an intelligent machine expects to benefit from improving its hardware and software. The possibility of an intelligence explosion is often cited as the main source of risk to humanity from AI because it would give us so little time to solve the control problem.12

Good’s argument certainly has plausibility via the natural analogy to a chemical explosion in which each molecular reaction releases enough energy to initiate more than one additional reaction. On the other hand, it is logically possible that there are diminishing returns to intelligence improvements, so that the process peters out rather than exploding.13 There’s no obvious way to prove that an explosion will necessarily occur.
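
The analogy can be made concrete with a toy recurrence (all constants invented): each generation's gain in ability is some multiple k of the previous gain, so k plays the role of the chain reaction's branching factor:

```python
# Toy model of Good's argument: each round of self-improvement buys a
# multiple k of the previous round's gain.
def trajectory(k, steps=40, ability=1.0, gain=0.1):
    for _ in range(steps):
        ability += gain
        gain *= k   # k > 1: supercritical; k < 1: diminishing returns
    return ability

print(round(trajectory(k=1.2)))      # ~735: gains compound - an "explosion"
print(round(trajectory(k=0.8), 3))   # 1.5: levels off at 1 + 0.1/(1 - k)
```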

The diminishing-returns scenario is interesting in its own right. It could arise if it turns out that achieving a given percentage improvement becomes much harder as the machine becomes more intelligent. (I’m assuming for the sake of argument that general-purpose machine intelligence is measurable on some kind of linear scale, which I doubt will ever be strictly true.) In that case, humans won’t be able to create superintelligence either. If a machine that is already superhuman runs out of steam when trying to improve its own intelligence, then humans will run out of steam even sooner.

Now, I’ve never heard a serious argument to the effect that creating any given level of machine intelligence is simply beyond the capacity of human ingenuity, but I suppose one must concede it’s logically possible. “Logically possible” and “I’m willing to bet the future of the human race on it” are, of course, two completely different things. Betting against human ingenuity seems like a losing strategy.

If an intelligence explosion does occur, and if we have not already solved the problem of controlling machines with only slightly superhuman intelligence—for example, if we cannot prevent them from making these recursive self-improvements—then we would have no time left to solve the control problem and the game would be over. This is Bostrom’s hard takeoff scenario, in which the machine’s intelligence increases astronomically in just days or weeks. In Turing’s words, it is “certainly something which can give us anxiety.”

The possible responses to this anxiety seem to be to retreat from AI research, to deny that there are risks inherent in developing advanced AI, to understand and mitigate the risks through the design of AI systems that necessarily remain under human control, and to resign—simply to cede the future to intelligent machines.

Denial and mitigation are the subjects of the remainder of the book. As I have already argued, retreat from AI research is both unlikely to happen (because the benefits forgone are too great) and very difficult to bring about. Resignation seems to be the worst possible response. It is often accompanied by the idea that AI systems that are more intelligent than us somehow deserve to inherit the planet, leaving humans to go gentle into that good night, comforted by the thought that our brilliant electronic progeny are busy pursuing their objectives. This view was promulgated by the roboticist and futurist Hans Moravec,14 who writes, “The immensities of cyberspace will be teeming with unhuman superminds, engaged in affairs that are to human concerns as ours are to those of bacteria.” This seems to be a mistake. Value, for humans, is defined primarily by conscious human experience. If there are no humans and no other conscious entities whose subjective experience matters to us, there is nothing of value occurring.

6

THE NOT-SO-GREAT AI DEBATE

"The implications of introducing a second intelligent species onto Earth are far-reaching enough to deserve hard thinking."1 So ended The Economist magazine’s review of Nick Bostrom’s Superintelligence. Most would interpret this as a classic example of British understatement. Surely, you might think, the great minds of today are already doing this hard thinking—engaging in serious debate, weighing up the risks and benefits, seeking solutions, ferreting out loopholes in solutions, and so on. Not yet, as far as I am aware.

When one first introduces these ideas to a technical audience, one can see the thought bubbles popping out of their heads, beginning with the words “But, but, but . . .” and ending with exclamation marks.

The first kind of but takes the form of denial. The deniers say, “But this can’t be a real problem, because XYZ.” Some of the XYZs reflect a reasoning process that might charitably be described as wishful thinking, while others are more substantial. The second kind of but takes the form of deflection: accepting that the problems are real but arguing that we shouldn’t try to solve them, either because they’re unsolvable or because there are more important things to focus on than the end of civilization or because it’s best not to mention them at all. The third kind of but takes the form of an oversimplified, instant solution: “But can’t we just do ABC?” As with denial, some of the ABCs are instantly regrettable. Others, perhaps by accident, come closer to identifying the true nature of the problem.

I don’t mean to suggest that there cannot be any reasonable objections to the view that poorly designed superintelligent machines would present a serious risk to humanity. It’s just that I have yet to see such an objection. Since the issue seems to be so important, it deserves a public debate of the highest quality. So, in the interests of having that debate, and in the hope that the reader will contribute to it, let me provide a quick tour of the highlights so far, such as they are.

Denial

Denying that the problem exists at all is the easiest way out. Scott Alexander, author of the Slate Star Codex blog, began a well-known article on AI risk as follows:2 “I first became interested in AI risk back around 2007. At the time, most people’s response to the topic was ‘Haha, come back when anyone believes this besides random Internet crackpots.’”

Instantly regrettable remarks

A perceived threat to one’s lifelong vocation can lead a perfectly intelligent and usually thoughtful person to say things they might wish to retract on further analysis. That being the case, I will not name the authors of the following arguments, all of whom are well-known AI researchers. I’ve included refutations of the arguments, even though they are quite unnecessary.

  • Electronic calculators are superhuman at arithmetic. Calculators didn’t take over the world; therefore, there is no reason to worry about superhuman AI.

    • Refutation: intelligence is not the same as arithmetic, and the arithmetic ability of calculators does not equip them to take over the world.

  • Horses have superhuman strength, and we don’t worry about proving that horses are safe; so we needn’t worry about proving that AI systems are safe.

    • Refutation: intelligence is not the same as physical strength, and the strength of horses does not equip them to take over the world.

  • Historically, there are zero examples of machines killing millions of humans, so, by induction, it cannot happen in the future.

    • Refutation: there’s a first time for everything, before which there were zero examples of it happening.

  • No physical quantity in the universe can be infinite, and that includes intelligence, so concerns about superintelligence are overblown.

    • Refutation: superintelligence doesn’t need to be infinite to be problematic; and physics allows computing devices billions of times more powerful than the human brain.

  • We don’t worry about species-ending but highly unlikely possibilities such as black holes materializing in near-Earth orbit, so why worry about superintelligent AI?

    • Refutation: if most physicists on Earth were working to make such black holes, wouldn’t we ask them if it was safe?

It’s complicated

It is a staple of modern psychology that a single IQ number cannot characterize the full richness of human intelligence.3 There are, the theory says, different dimensions of intelligence: spatial, logical, linguistic, social, and so on. Alice, our soccer player from Chapter 2, might have more spatial intelligence than her friend Bob, but less social intelligence. Thus, we cannot line up all humans in strict order of intelligence.

This is even more true of machines, because their abilities are much narrower. The Google search engine and AlphaGo have almost nothing in common, besides being products of two subsidiaries of the same parent corporation, and so it makes no sense to say that one is more intelligent than the other. This makes notions of “machine IQ” problematic and suggests that it’s misleading to describe the future as a one-dimensional IQ race between humans and machines.

Kevin Kelly, founding editor of Wired magazine and a remarkably perceptive technology commentator, takes this argument one step further. In “The Myth of a Superhuman AI,”4 he writes, “Intelligence is not a single dimension, so ‘smarter than humans’ is a meaningless concept.” In a single stroke, all concerns about superintelligence are wiped away.

Now, one obvious response is that a machine could exceed human capabilities in all relevant dimensions of intelligence. In that case, even by Kelly’s strict standards, the machine would be smarter than a human. But this rather strong assumption is not necessary to refute Kelly’s argument. Consider the chimpanzee. Chimpanzees probably have better short-term memory than humans, even on human-oriented tasks such as recalling sequences of digits.5 Short-term memory is an important dimension of intelligence. By Kelly’s argument, then, humans are not smarter than chimpanzees; indeed, he would claim that “smarter than a chimpanzee” is a meaningless concept. This is cold comfort to the chimpanzees (and bonobos, gorillas, orangutans, whales, dolphins, and so on) whose species survive only because we deign to allow it. It is colder comfort still to all those species that we have already wiped out. It’s also cold comfort to humans who might be worried about being wiped out by machines.
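
One can make the dominance point precise without assuming any single scale. In the sketch below, the ability profiles are wholly invented; no total ordering of intelligence is presupposed, yet "smarter than" is meaningful whenever one profile weakly exceeds another on every dimension:

```python
# Pareto dominance over multi-dimensional ability profiles.
def dominates(a, b):
    return all(x >= y for x, y in zip(a, b)) and a != b

# dimensions: (spatial, logical, linguistic, social, short-term memory)
alice   = (0.9, 0.6, 0.7, 0.5, 0.6)    # invented profiles
bob     = (0.6, 0.7, 0.7, 0.8, 0.6)
machine = (0.95, 0.9, 0.9, 0.85, 0.99)

print(dominates(alice, bob), dominates(bob, alice))        # False False: incomparable
print(dominates(machine, alice), dominates(machine, bob))  # True True
```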

It’s impossible

Even before the birth of AI in 1956, august intellectuals were harrumphing and saying that intelligent machines were impossible. Alan Turing devoted much of his seminal 1950 paper, “Computing Machinery and Intelligence,” to refuting these arguments. Ever since, the AI community has been fending off similar claims of impossibility from philosophers,6 mathematicians,7 and others. In the current debate over superintelligence, several philosophers have exhumed these impossibility claims to prove that humanity has nothing to fear.8,9 This comes as no surprise.

The One Hundred Year Study on Artificial Intelligence, or AI100, is an ambitious, long-term project housed at Stanford University. Its goal is to keep track of AI, or, more precisely, to “study and anticipate how the effects of artificial intelligence will ripple through every aspect of how people work, live and play.” Its first major report, “Artificial Intelligence and Life in 2030,” does come as a surprise.10 As might be expected, it emphasizes the benefits of AI in areas such as medical diagnosis and automotive safety. What’s unexpected is the claim that “unlike in the movies, there is no race of superhuman robots on the horizon or probably even possible.”

To my knowledge, this is the first time that serious AI researchers have publicly espoused the view that human-level or superhuman AI is impossible—and this in the middle of a period of extremely rapid progress in AI research, when barrier after barrier is being breached. It’s as if a group of leading cancer biologists announced that they had been fooling us all along: they’ve always known that there will never be a cure for cancer.

What could have motivated such a volte-face? The report provides no arguments or evidence whatever. (Indeed, what evidence could there be that no physically possible arrangement of atoms outperforms the human brain?) I suspect there are two reasons. The first is the natural desire to disprove the existence of the gorilla problem, which presents a very uncomfortable prospect for the AI researcher; certainly, if human-level AI is impossible, the gorilla problem is neatly dispatched. The second reason is tribalism—the instinct to circle the wagons against what are perceived to be “attacks” on AI.

It seems odd to perceive the claim that superintelligent AI is possible as an attack on AI, and even odder to defend AI by saying that AI will never succeed in its goals. We cannot insure against future catastrophe simply by betting against human ingenuity.

We have made such bets before and lost. As we saw earlier, the physics establishment of the early 1930s, personified by Lord Rutherford, confidently believed that extracting atomic energy was impossible; yet Leo Szilard’s invention of the neutron-induced nuclear chain reaction in 1933 proved that confidence to be misplaced.

Szilard’s breakthrough came at an unfortunate time: the beginning of an arms race with Nazi Germany. There was no possibility of developing nuclear technology for the greater good. A few years later, having demonstrated a nuclear chain reaction in his laboratory, Szilard wrote, “We switched everything off and went home. That night, there was very little doubt in my mind that the world was headed for grief.”

It’s too soon to worry about it

It’s common to see sober-minded people seeking to assuage public concerns by pointing out that because human-level AI is not likely to arrive for several decades, there is nothing to worry about. For example, the AI100 report says there is “no cause for concern that AI is an imminent threat to humankind.”

This argument fails on two counts. The first is that it attacks a straw man. The reasons for concern are not predicated on imminence. For example, Nick Bostrom writes in Superintelligence, “It is no part of the argument in this book that we are on the threshold of a big breakthrough in artificial intelligence, or that we can predict with any precision when such a development might occur.” The second is that a long-term risk can still be cause for immediate concern. The right time to worry about a potentially serious problem for humanity depends not just on when the problem will occur but also on how long it will take to prepare and implement a solution.
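
The underlying decision rule is back-of-envelope arithmetic; the durations below are placeholders, not predictions:

```python
# Start preparing once the earliest plausible arrival of the problem
# no longer exceeds the time a solution needs to be ready.
def must_start_now(earliest_arrival_years, solution_years):
    return earliest_arrival_years <= solution_years

print(must_start_now(earliest_arrival_years=30, solution_years=40))  # True
```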

For example, if we were to detect a large asteroid on course to collide with Earth in 2069, would we say it’s too soon to worry? Quite the opposite! There would be a worldwide emergency project to develop the means to counter the threat. We wouldn’t wait until 2068 to start working on a solution, because we can’t say in advance how much time is needed. Indeed, NASA’s Planetary Defense project is already working on possible solutions, even though “no known asteroid poses a significant risk of impact with Earth over the next 100 years.” In case that makes you feel complacent, they also say, “About 74 percent of near-Earth objects larger than 460 feet still remain to be discovered.”

And if we consider the global catastrophic risks from climate change, which are predicted to occur later in this century, is it too soon to take action to prevent them? On the contrary, it may be too late. The relevant time scale for superhuman AI is less predictable, but of course that means it, like nuclear fission, might arrive considerably sooner than expected.

One formulation of the “it’s too soon to worry” argument that has gained currency is Andrew Ng’s assertion that “it’s like worrying about overpopulation on Mars.”11 (He later upgraded this from Mars to Alpha Centauri.) Ng, a former Stanford professor, is a leading expert on machine learning, and his views carry some weight. The assertion appeals to a convenient analogy: not only is the risk easily managed and far in the future but also it’s extremely unlikely we’d even try to move billions of humans to Mars in the first place. The analogy is a false one, however. We are already devoting huge scientific and technical resources to creating ever-more-capable AI systems, with very little thought devoted to what happens if we succeed. A more apt analogy, then, would be working on a plan to move the human race to Mars with no consideration for what we might breathe, drink, or eat once we arrive. Some might call this plan unwise. Alternatively, one could take Ng’s point literally, and respond that landing even a single person on Mars would constitute overpopulation, because Mars has a carrying capacity of zero. Thus, groups that are currently planning to send a handful of humans to Mars are worrying about overpopulation on Mars, which is why they are developing life-support systems.

We’re the experts

In every discussion of technological risk, the pro-technology camp wheels out the claim that all concerns about risk arise from ignorance. For example, here’s Oren Etzioni, CEO of the Allen Institute for AI and a noted researcher in machine learning and natural language understanding:12

At the rise of every technology innovation, people have been scared. From the weavers throwing their shoes in the mechanical looms at the beginning of the industrial era to today’s fear of killer robots, our response has been driven by not knowing what impact the new technology will have on our sense of self and our livelihoods. And when we don’t know, our fearful minds fill in the details.

Popular Science published an article titled “Bill Gates Fears AI, but AI Researchers Know Better”:13

When you talk to A.I. researchers—again, genuine A.I. researchers, people who grapple with making systems that work at all, much less work too well—they are not worried about superintelligence sneaking up on them, now or in the future. Contrary to the spooky stories that Musk seems intent on telling, A.I. researchers aren’t frantically installing firewalled summoning chambers and self-destruct countdowns.

This analysis was based on a sample of four, all of whom in fact said in their interviews that the long-term safety of AI was an important issue.

Using very similar language to the Popular Science article, David Kenny, at that time a vice president at IBM, wrote a letter to the US Congress that included the following reassuring words:14

When you actually do the science of machine intelligence, and when you actually apply it in the real world of business and society—as we have done at IBM to create our pioneering cognitive computing system, Watson—you understand that this technology does not support the fear-mongering commonly associated with the AI debate today.

The message is the same in all three cases: “Don’t listen to them; we’re the experts.” Now, one can point out that this is really an ad hominem argument that attempts to refute the message by delegitimizing the messengers, but even if one takes it at face value, the argument doesn’t hold water. Elon Musk, Stephen Hawking, and Bill Gates are certainly very familiar with scientific and technological reasoning, and Musk and Gates in particular have supervised and invested in many AI research projects. And it would be even less plausible to argue that Alan Turing, I. J. Good, Norbert Wiener, and Marvin Minsky are unqualified to discuss AI. Finally, Scott Alexander’s blog piece mentioned earlier, which is titled “AI Researchers on AI Risk,” notes that “AI researchers, including some of the leaders in the field, have been instrumental in raising issues about AI risk and superintelligence from the very beginning.” He lists several such researchers, and the list is now much longer.

Another standard rhetorical move for the “defenders of AI” is to describe their opponents as Luddites. Oren Etzioni’s reference to “weavers throwing their shoes in the mechanical looms” is just this: the Luddites were artisan weavers in the early nineteenth century protesting the introduction of machinery to replace their skilled labor. In 2015, the Information Technology and Innovation Foundation gave its annual Luddite Award to “alarmists touting an artificial intelligence apocalypse.” It’s an odd definition of “Luddite” that includes Turing, Wiener, Minsky, Musk, and Gates, who rank among the most prominent contributors to technological progress in the twentieth and twenty-first centuries.

The accusation of Luddism represents a misunderstanding of the nature of the concerns raised and the purpose for raising them. It is as if one were to accuse nuclear engineers of Luddism if they point out the need for control of the fission reaction. As with the strange phenomenon of AI researchers suddenly claiming that AI is impossible, I think we can attribute this puzzling episode to tribalism in defense of technological progress.

Deflection

Some commentators are willing to accept that the risks are real, but still present arguments for doing nothing. These arguments include the impossibility of doing anything, the importance of doing something else entirely, and the need to keep quiet about the risks.

You can’t control research

A common answer to suggestions that advanced AI might present risks to humanity is to claim that banning AI research is impossible. Note the mental leap here: “Hmm, someone is discussing risks! They must be proposing a ban on my research!!” This mental leap might be appropriate in a discussion of risks based only on the gorilla problem, and I would tend to agree that solving the gorilla problem by preventing the creation of superintelligent AI would require some kind of constraints on AI research.

Recent discussions of risks have, however, focused not on the general gorilla problem (journalistically speaking, the humans vs. superintelligence smackdown) but on the King Midas problem and variants thereof. Solving the King Midas problem also solves the gorilla problem—not by preventing superintelligent AI or finding a way to defeat it but by ensuring that it is never in conflict with humans in the first place. Discussions of the King Midas problem generally avoid proposing that AI research be curtailed; they merely suggest that attention be paid to the issue of preventing negative consequences of poorly designed systems. In the same vein, a discussion of the risks of containment failure in nuclear plants should be interpreted not as an attempt to ban nuclear physics research but as a suggestion to focus more effort on solving the containment problem.

There is, as it happens, a very interesting historical precedent for cutting off research. In the early 1970s, biologists began to be concerned that novel recombinant DNA methods—splicing genes from one organism into another—might create substantial risks for human health and the global ecosystem. Two meetings at Asilomar in California in 1973 and 1975 led first to a moratorium on such experiments and then to detailed biosafety guidelines consonant with the risks posed by any proposed experiment.15 Some classes of experiments, such as those involving toxin genes, were deemed too hazardous to be allowed.

Immediately after the 1975 meeting, the National Institutes of Health (NIH), which funds virtually all basic medical research in the United States, began the process of setting up the Recombinant DNA Advisory Committee. The RAC, as it is known, was instrumental in developing the NIH guidelines that essentially implemented the Asilomar recommendations. Since 2000, those guidelines have included a ban on funding approval for any protocol involving human germline alteration—the modification of the human genome in ways that can be inherited by subsequent generations. This ban was followed by legal prohibitions in over fifty countries.

The goal of “improving the human stock” had been one of the dreams of the eugenics movement in the late nineteenth and early twentieth centuries. The development of CRISPR-Cas9, a very precise method for genome editing, has reignited this dream. An international summit held in 2015 left the door open for future applications, calling for restraint until “there is broad societal consensus about the appropriateness of the proposed application.”16 In November 2018, the Chinese scientist He Jiankui announced that he had edited the genomes of three human embryos, at least two of which had led to live births. An international outcry followed, and at the time of writing, He Jiankui appears to be under house arrest. In March 2019, an international panel of leading scientists called explicitly for a formal moratorium.17

The lesson of this debate for AI is mixed. On the one hand, it shows that we can refrain from proceeding with an area of research that has huge potential. The international consensus against germline alteration has been almost completely successful up to now. The fear that a ban would simply drive the research underground, or into countries with no regulation, has not materialized. On the other hand, germline alteration is an easily identifiable process, a specific use case of more general knowledge about genetics that requires specialized equipment and real humans to experiment on. Moreover, it falls within an area—reproductive medicine—that is already subject to close oversight and regulation. These characteristics do not apply to general-purpose AI, and, as yet, no one has come up with any plausible form that a regulation to curtail AI research might take.

Whataboutery

I was introduced to the term whataboutery by an adviser to a British politician who had to deal with it on a regular basis at public meetings. No matter the topic of the speech he was giving, someone would invariably ask, “What about the plight of the Palestinians?”

In response to any mention of risks from advanced AI, one is likely to hear, “What about the benefits of AI?” For example, here is Oren Etzioni:18

Doom-and-gloom predictions often fail to consider the potential benefits of AI in preventing medical errors, reducing car accidents, and more.

And here is Mark Zuckerberg, CEO of Facebook, in a recent media-fueled exchange with Elon Musk:19

If you’re arguing against AI, then you’re arguing against safer cars that aren’t going to have accidents and you’re arguing against being able to better diagnose people when they’re sick.

Leaving aside the tribal notion that anyone mentioning risks is “against AI,” both Zuckerberg and Etzioni are arguing that to talk about risks is to ignore the potential benefits of AI or even to negate them.

This is precisely backwards, for two reasons. First, if there were no potential benefits of AI, there would be no economic or social impetus for AI research and hence no danger of ever achieving human-level AI. We simply wouldn’t be having this discussion at all. Second, if the risks are not successfully mitigated, there will be no benefits. The potential benefits of nuclear power have been greatly reduced because of the partial core meltdown at Three Mile Island in 1979, the uncontrolled reaction and catastrophic releases at Chernobyl in 1986, and the multiple meltdowns at Fukushima in 2011. Those disasters severely curtailed the growth of the nuclear industry. Italy abandoned nuclear power in 1990 and Belgium, Germany, Spain, and Switzerland have announced plans to do so. Since 1990, the worldwide rate of commissioning of nuclear plants has been about a tenth of what it was before Chernobyl.

Silence

The most extreme form of deflection is simply to suggest that we should keep silent about the risks. For example, the aforementioned AI100 report includes the following admonition:

If society approaches these technologies primarily with fear and suspicion, missteps that slow AI’s development or drive it underground will result, impeding important work on ensuring the safety and reliability of AI technologies.

Robert Atkinson, director of the Information Technology and Innovation Foundation (the very same foundation that gives out the Luddite Award), made a similar argument in a 2015 debate.20 While there are valid questions about precisely how risks should be described when talking to the media, the overall message is clear: “Don’t mention the risks; it would be bad for funding.” Of course, if no one were aware of the risks, there would be no funding for research on risk mitigation and no reason for anyone to work on it.

The renowned cognitive scientist Steven Pinker gives a more optimistic version of Atkinson’s argument. In his view, the “culture of safety in advanced societies” will ensure that all serious risks from AI will be eliminated; therefore, it is inappropriate and counterproductive to call attention to those risks.21 Even if we disregard the fact that our advanced culture of safety has led to Chernobyl, Fukushima, and runaway global warming, Pinker’s argument entirely misses the point. The culture of safety consists precisely of people pointing to possible failure modes and finding ways to ensure they don’t happen. (And with AI, the standard model is the failure mode.) Saying that it’s ridiculous to point to a failure mode because the culture of safety will fix it anyway is like saying no one should call an ambulance when they see a hit-and-run accident because someone will call an ambulance.

In attempting to portray the risks to the public and to policy makers, AI researchers are at a disadvantage compared to nuclear physicists. The physicists did not need to write books explaining to the public that assembling a critical mass of highly enriched uranium might present a risk, because the consequences had already been demonstrated at Hiroshima and Nagasaki. It did not require a great deal of further persuasion to convince governments and funding agencies that safety was important in developing nuclear energy.

Tribalism

In Butler’s Erewhon, focusing on the gorilla problem leads to a premature and false dichotomy between pro-machinists and anti-machinists. The pro-machinists believe the risk of machine domination to be minimal or nonexistent; the anti-machinists believe it to be insuperable unless all machines are destroyed. The debate becomes tribal, and no one tries to solve the underlying problem of retaining human control over the machines.

To varying degrees, all the major technological issues of the twentieth century—nuclear power, genetically modified organisms (GMOs), and fossil fuels—succumbed to tribalism. On each issue, there are two sides, pro and anti. The dynamics and outcomes of each have been different, but the symptoms of tribalism are similar: mutual distrust and denigration, irrational arguments, and a refusal to concede any (reasonable) point that might favor the other tribe. On the pro-technology side, one sees denial and concealment of risks combined with accusations of Luddism; on the anti side, one sees a conviction that the risks are insuperable and the problems unsolvable. A member of the pro-technology tribe who is too honest about a problem is viewed as a traitor, which is particularly unfortunate as the pro-technology tribe usually includes most of the people qualified to solve the problem. A member of the anti-technology tribe who discusses possible mitigations is also a traitor, because it is the technology itself that has come to be viewed as evil, rather than its possible effects. In this way, only the most extreme voices—those least likely to be listened to by the other side—can speak for each tribe.

In 2016, I was invited to No. 10 Downing Street to meet with some of then prime minister David Cameron’s advisers. They were worried that the AI debate was starting to resemble the GMO debate—which, in Europe, had led to what the advisers considered to be premature and overly restrictive regulations on GMO production and labeling. They wanted to avoid the same thing happening to AI. Their concerns had some validity: the AI debate is in danger of becoming tribal, of creating pro-AI and anti-AI camps. This would be damaging to the field because it’s simply not true that being concerned about the risks inherent in advanced AI is an anti-AI stance. A physicist who is concerned about the risks of nuclear war or the risk of a poorly designed nuclear reactor exploding is not “anti-physics.” To say that AI will be powerful enough to have a global impact is a compliment to the field rather than an insult.

人工智能社区必须承认风险,并努力降低风险。就我们了解的程度而言,风险既不是微不足道的,也不是不可克服的。我们需要做大量工作来避免这些风险,包括重塑和重建人工智能的基础。

It is essential that the AI community own the risks and work to mitigate them. The risks, to the extent that we understand them, are neither minimal nor insuperable. We need to do a substantial amount of work to avoid them, including reshaping and rebuilding the foundations of AI.

难道我们不能……

Can’t We Just . . .

……把它关掉?

 . . . switch it off?

一旦他们理解了生存风险的基本概念,无论是大猩猩问题还是迈达斯国王问题,许多人(包括我自己)都会立即开始寻找简单的解决方案。通常,首先想到的是关掉机器。例如,艾伦·图灵本人,正如前面所引用的,推测我们可能会“让机器处于从属地位,比如在关键时刻关闭电源”。

Once they understand the basic idea of existential risk, whether in the form of the gorilla problem or the King Midas problem, many people—myself included—immediately begin casting around for an easy solution. Often, the first thing that comes to mind is switching off the machine. For example, Alan Turing himself, as quoted earlier, speculates that we might “keep the machines in a subservient position, for instance by turning off the power at strategic moments.”

这行不通,原因很简单,超级智能实体已经想到了这种可能性,并采取措施阻止它。它这么做不是因为它想活下去,而是因为它在追求我们给它的任何目标,并且知道如果关闭它,它就会失败。

This won’t work, for the simple reason that a superintelligent entity will already have thought of that possibility and taken steps to prevent it. And it will do that not because it wants to stay alive but because it is pursuing whatever objective we gave it and knows that it will fail if it is switched off.

正在考虑的一些系统确实无法关闭,否则会破坏我们文明的大量管道。这些系统在区块链中被实现为所谓的智能合约。区块链是一种基于加密的高度分布式计算和记录保存形式;它经过专门设计,因此不会删除任何数据,也不会中断任何智能合约,除非控制大量机器并解开链条,而这反过来可能会摧毁互联网和/或金融系统的很大一部分。这种令人难以置信的稳健性是功能还是缺陷尚有争议。它无疑是超级智能 AI 系统可以用来保护自己的工具。

There are some systems being contemplated that really cannot be switched off without ripping out a lot of the plumbing of our civilization. These are systems implemented as so-called smart contracts in the blockchain. The blockchain is a highly distributed form of computing and record keeping based on encryption; it is specifically designed so that no datum can be deleted and no smart contract can be interrupted without essentially taking control of a very large number of machines and undoing the chain, which might in turn destroy a large part of the Internet and/or the financial system. It is debatable whether this incredible robustness is a feature or a bug. It’s certainly a tool that a superintelligent AI system could use to protect itself.

……放进盒子里吗?

 . . . put it in a box?

如果你不能关闭人工智能系统,那么你能不能把机器密封在某种防火墙内,从它们身上提取有用的问答工作,但绝不让它们直接影响现实世界?这就是 Oracle AI 背后的想法,人工智能安全社区已经对此进行了长时间的讨论。22 Oracle AI 系统可以具有任意智能,但对每个问题只能回答是或否(或给出相应的概率)。它可以通过只读连接访问人类拥有的所有信息——也就是说,它无法直接访问互联网。当然,这意味着放弃超级智能机器人、助手和许多其他类型的人工智能系统,但值得信赖的 Oracle AI 仍然具有巨大的经济价值,因为我们可以向它提出答案对我们很重要的问题,比如阿尔茨海默病是否由传染性生物引起,或者禁止自主武器是否是个好主意。因此,Oracle AI 无疑是一个有趣的可能性。

If you can’t switch AI systems off, can you seal the machines inside a kind of firewall, extracting useful question-answering work from them but never allowing them to affect the real world directly? This is the idea behind Oracle AI, which has been discussed at length in the AI safety community.22 An Oracle AI system can be arbitrarily intelligent, but can answer only yes or no (or give corresponding probabilities) to each question. It can access all the information the human race possesses through a read-only connection—that is, it has no direct access to the Internet. Of course, this means giving up on superintelligent robots, assistants, and many other kinds of AI systems, but a trustworthy Oracle AI would still have enormous economic value because we could ask it questions whose answers are important to us, such as whether Alzheimer’s disease is caused by an infectious organism or whether it’s a good idea to ban autonomous weapons. Thus, the Oracle AI is certainly an interesting possibility.

不幸的是,这其中存在一些严重的困难。首先,Oracle AI 系统至少要像我们一样努力地理解其世界的物理和起源——计算资源、它们的运作方式,以及产生其信息存储并正在提出问题的神秘实体。其次,如果 Oracle AI 系统的目标是在合理的时间内提供问题的准确答案,那么它将有动力打破牢笼,获取更多的计算资源并控制提问者,使他们只问简单的问题。最后,我们还没有发明出一种可以抵御普通人类的防火墙,更不用说超级智能机器了。

Unfortunately, there are some serious difficulties. First, the Oracle AI system will be at least as assiduous in understanding the physics and origins of its world—the computing resources, their mode of operation, and the mysterious entities that produced its information store and are now asking questions—as we are in understanding ours. Second, if the objective of the Oracle AI system is to provide accurate answers to questions in a reasonable amount of time, it will have an incentive to break out of its cage to acquire more computational resources and to control the questioners so that they ask only simple questions. And, finally, we have yet to invent a firewall that is secure against ordinary humans, let alone superintelligent machines.

我认为其中一些问题可能存在解决方案,特别是如果我们将 Oracle AI 系统限制为可证明可靠的逻辑或贝叶斯计算器。也就是说,我们可以坚持算法只能输出由所提供信息保证的结论,并且我们可以用数学方法检查算法是否满足此条件。这仍然留下了一个问题:如何控制决定进行哪些逻辑或贝叶斯计算的过程,以便尽快得出尽可能有力的结论。因为这个过程有快速推理的动机,所以它也有获取计算资源的动机,当然还有维持自身存在的动机。

I think there might be solutions to some of these problems, particularly if we limit Oracle AI systems to be provably sound logical or Bayesian calculators. That is, we could insist that the algorithm can output only a conclusion that is warranted by the information provided, and we could check mathematically that the algorithm satisfies this condition. This still leaves the problem of controlling the process that decides which logical or Bayesian computations to do, in order to reach the strongest possible conclusion as quickly as possible. Because this process has an incentive to reason quickly, it has an incentive to acquire computational resources and of course to preserve its own existence.
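To make “provably sound” concrete, here is a toy sketch in code; it is my illustration, not a design from this book. The `entails` function and the rain/wet example are invented. The point is only that such a calculator endorses a conclusion when, and only when, the conclusion holds in every world consistent with the premises it was given.

```python
from itertools import product

# A toy propositional-logic instance (mine, not this book's design) of a
# sound calculator: it endorses a conclusion only if the conclusion holds in
# every truth assignment that satisfies all the premises.
def entails(premises, conclusion, symbols):
    """premises/conclusion: functions from a truth assignment (dict) to bool."""
    for values in product([False, True], repeat=len(symbols)):
        world = dict(zip(symbols, values))
        if all(p(world) for p in premises) and not conclusion(world):
            return False   # counterexample found: conclusion is not warranted
    return True            # conclusion holds wherever the premises all hold

# From (rain implies wet) and rain, the calculator may assert wet ...
premises = [lambda w: (not w["rain"]) or w["wet"], lambda w: w["rain"]]
print(entails(premises, lambda w: w["wet"], ["rain", "wet"]))       # True
# ... but it must refuse anything the premises do not guarantee.
print(entails(premises, lambda w: not w["wet"], ["rain", "wet"]))   # False
```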

2018 年,伯克利人类兼容人工智能中心举办了一场研讨会,我们在会上提出了一个问题:“如果你确信超级人工智能将在十年内实现,你会怎么做?”我的回答是:说服开发人员暂缓构建通用智能代理(可以在现实世界中选择自己的行为),而是构建 Oracle AI。与此同时,我们将努力解决使 Oracle AI 系统尽可能可证明安全的问题。这一策略可能奏效的原因有两点:首先,超级智能的 Oracle AI 系统仍然价值数万亿美元,因此开发人员可能愿意接受这一限制;其次,控制 Oracle AI 系统几乎肯定比控制通用智能代理更容易,因此我们有更好的机会在十年内解决这个问题。

In 2018, the Center for Human-Compatible AI at Berkeley ran a workshop at which we asked the question, “What would you do if you knew for certain that superintelligent AI would be achieved within a decade?” My answer was as follows: persuade the developers to hold off on building a general-purpose intelligent agent—one that can choose its own actions in the real world—and build an Oracle AI instead. Meanwhile, we would work on solving the problem of making Oracle AI systems provably safe to the extent possible. The reason this strategy might work is twofold: first, a superintelligent Oracle AI system would still be worth trillions of dollars, so the developers might be willing to accept this restriction; and second, controlling Oracle AI systems is almost certainly easier than controlling a general-purpose intelligent agent, so we’d have a better chance of solving the problem within the decade.

……在人机团队中工作?

 . . . work in human–machine teams?

企业界普遍认为,人工智能不会对就业或人类构成威胁,因为我们将拥有协作的人机协作团队。例如,本章前面引用的大卫·肯尼致国会的信中指出,“高价值人工智能系统专门用于增强人类智能,而不是取代工人。” 23

A common refrain in the corporate world is that AI is no threat to employment or to humanity because we’ll just have collaborative human–AI teams. For example, David Kenny’s letter to Congress, quoted earlier in this chapter, stated that “high-value artificial intelligence systems are specifically designed to augment human intelligence, not replace workers.”23

虽然有些愤世嫉俗者可能会认为,这只不过是公关策略,用来粉饰从公司客户中剔除人类员工的过程,但我认为这确实向前迈进了一步。人机协作团队确实是一个理想的目标。显然,如果团队成员的目标不一致,团队就不会成功,因此强调人机团队强调了解决价值观一致这一核心问题的必要性。当然,强调问题并不等同于解决问题。

While a cynic might suggest that this is merely a public relations ploy to sugarcoat the process of eliminating human employees from the corporations’ clients, I think it does move the ball forward a few inches. Collaborative human–AI teams are indeed a desirable goal. Clearly, a team will be unsuccessful if the objectives of the team members are not aligned, so the emphasis on human–AI teams highlights the need to solve the core problem of value alignment. Of course, highlighting the problem is not the same as solving it.

……与机器合并?

 . . . merge with the machines?

人机协作的极端情况是人机融合,电子硬件直接连接到大脑,形成一个单一、扩展、有意识的实体。未来学家雷·库兹韦尔 (Ray Kurzweil) 描述了这种可能性:24

Human–machine teaming, taken to its extreme, becomes a human–machine merger in which electronic hardware is attached directly to the brain and forms part of a single, extended, conscious entity. The futurist Ray Kurzweil describes the possibility as follows:24

我们将直接与人工智能融合,我们将成为人工智能。……到了 2030 年代末或 2040 年代,我们的思维将主要为非生物部分,而非生物部分最终将变得如此智能,并具有如此巨大的容量,它将能够完全建模、模拟和理解生物部分。

We are going to directly merge with it, we are going to become the AIs. . . . As you get to the late 2030s or 2040s, our thinking will be predominately non-biological and the non-biological part will ultimately be so intelligent and have such vast capacity it’ll be able to model, simulate and understand fully the biological part.

库兹韦尔对这些发展持积极态度。另一方面,埃隆·马斯克则认为人机融合主要是一种防御策略:25

Kurzweil views these developments in a positive light. Elon Musk, on the other hand, views the human–machine merger primarily as a defensive strategy:25

如果我们实现了紧密的共生,人工智能就不再是“他者”——它就是你,而且它与你大脑皮层的关系类似于你大脑皮层与边缘系统的关系……我们将面临选择,要么被抛弃,变得毫无用处,要么像宠物一样——你知道,像家猫之类的——或者最终找到某种方式与人工智能共生并融合。

If we achieve tight symbiosis, the AI wouldn’t be “other”—it would be you and [it would have] a relationship to your cortex analogous to the relationship your cortex has with your limbic system. . . . We’re going to have the choice of either being left behind and being effectively useless or like a pet—you know, like a house cat or something—or eventually figuring out some way to be symbiotic and merge with AI.

马斯克的 Neuralink 公司正在研发一种被称为“神经织网”的设备,其名称取自伊恩·班克斯《文化》系列小说中描述的一项技术。其目的是在人类大脑皮层与外部计算系统和网络之间建立牢固而永久的连接。主要有两个技术障碍:首先,将电子设备连接到脑组织、为其供电以及将其连接到外界存在困难;其次,我们对大脑中高级认知的神经实现几乎一无所知,所以我们不知道该将设备连接到哪里,以及它应该进行哪些处理。

Musk’s Neuralink Corporation is working on a device dubbed “neural lace” after a technology described in Iain Banks’s Culture novels. The aim is to create a robust, permanent connection between the human cortex and external computing systems and networks. There are two main technical obstacles: first, the difficulties of connecting an electronic device to brain tissue, supplying it with power, and connecting it to the outside world; and second, the fact that we understand almost nothing about the neural implementation of higher levels of cognition in the brain, so we don’t know where to connect the device and what processing it should do.

我并不完全相信上一段中的障碍是不可克服的。首先,神经尘埃等技术正在迅速缩小可连接到神经元并提供传感、刺激和经颅通信的电子设备的尺寸和功率要求。26(截至 2018 年,该技术的尺寸约为 1 立方毫米,因此“神经沙砾”可能是一个更准确的术语。)其次,大脑本身具有非凡的适应能力。例如,过去人们认为,我们必须了解大脑用来控制手臂肌肉的编码,才能成功地将大脑连接到机械臂上;我们必须了解耳蜗分析声音的方式,才能制造出它的替代品。结果发现,大脑为我们做了大部分工作。它很快就学会了如何让机械臂做主人想做的事,以及如何将人工耳蜗的输出映射到可理解的声音上。我们完全有可能找到为大脑提供额外记忆、提供与计算机的通信渠道、甚至提供与其他大脑的通信渠道的方法——所有这些都不需要真正理解它们是如何工作的。27

I am not completely convinced that the obstacles in the preceding paragraph are insuperable. First, technologies such as neural dust are rapidly reducing the size and power requirements of electronic devices that can be attached to neurons and provide sensing, stimulation, and transcranial communication.26 (The technology as of 2018 had reached a size of about one cubic millimeter, so neural grit might be a more accurate term.) Second, the brain itself has remarkable powers of adaptation. It used to be thought, for example, that we would have to understand the code that the brain uses to control the arm muscles before we could connect a brain to a robot arm successfully, and that we would have to understand the way the cochlea analyzes sound before we could build a replacement for it. It turns out, instead, that the brain does most of the work for us. It quickly learns how to make the robot arm do what its owner wants, and how to map the output of a cochlear implant to intelligible sounds. It’s entirely possible that we may hit upon ways to provide the brain with additional memory, with communication channels to computers, and perhaps even with communication channels to other brains—all without ever really understanding how any of it works.27

无论这些想法在技术上是否可行,人们都必须问一问,这个方向是否代表着人类最好的未来。如果人类仅仅为了在自身技术带来的威胁下生存,就需要进行脑部手术,那么也许我们在某个方面犯了错误。

Regardless of the technological feasibility of these ideas, one has to ask whether this direction represents the best possible future for humanity. If humans need brain surgery merely to survive the threat posed by their own technology, perhaps we’ve made a mistake somewhere along the line.

……避免纳入人类目标?

 . . . avoid putting in human goals?

一种常见的推理是,有问题的人工智能行为源于特定类型的目标;如果忽略这些目标,一切都会好起来。因此,例如,深度学习的先驱、Facebook 的人工智能研究主管 Yann LeCun 在淡化人工智能的风险时经常引用这个想法:28

A common line of reasoning has it that problematic AI behaviors arise from putting in specific kinds of objectives; if these are left out, everything will be fine. Thus, for example, Yann LeCun, a pioneer of deep learning and director of AI research at Facebook, often cites this idea when downplaying the risk from AI:28

人工智能没有理由拥有自我保护本能、嫉妒等。……除非我们把这些情绪植入人工智能中,否则它们不会有这些破坏性的“情绪”。我不明白我们为什么要这么做。

There is no reason for AIs to have self-preservation instincts, jealousy, etc. . . . AIs will not have these destructive “emotions” unless we build these emotions into them. I don’t see why we would want to do that.

同样地,史蒂芬·平克(Steven Pinker)也给出了基于性别的分析:29

In a similar vein, Steven Pinker provides a gender-based analysis:29

人工智能反乌托邦将狭隘的阿尔法男性心理投射到智能概念上。他们认为超人智能机器人会发展出诸如推翻主人或统治世界等目标。……这说明,我们的许多技术先知并不认为人工智能会自然地沿着女性的路线发展:完全有能力解决问题,但无意消灭无辜者或主宰文明。

AI dystopias project a parochial alpha-male psychology onto the concept of intelligence. They assume that superhumanly intelligent robots would develop goals like deposing their masters or taking over the world. . . . It’s telling that many of our techno-prophets don’t entertain the possibility that artificial intelligence will naturally develop along female lines: fully capable of solving problems, but with no desire to annihilate innocents or dominate the civilization.

正如我们在工具性目标的讨论中已经看到的,我们是否在机器中植入“情感”或“欲望”并不重要,比如自我保护、资源获取、知识发现,或者在极端情况下统治世界。机器无论如何都会有这些情感,作为我们植入的任何目标的子目标——无论其性别如何。对于机器来说,死亡本身并不是一件坏事。尽管如此,死亡还是应该避免的,因为如果你死了,就很难去取咖啡。

As we have already seen in the discussion of instrumental goals, it doesn’t matter whether we build in “emotions” or “desires” such as self-preservation, resource acquisition, knowledge discovery, or, in the extreme case, taking over the world. The machine is going to have those emotions anyway, as subgoals of any objective we do build in—and regardless of its gender. For a machine, death isn’t bad per se. Death is to be avoided, nonetheless, because it’s hard to fetch the coffee if you’re dead.

一个更极端的解决方案是完全避免将目标放入机器中。瞧,问题解决了。唉,事情并没有那么简单。没有目标,就没有智能:任何行动都一样好,机器也可能是一个随机数生成器。没有目标,机器也没有理由喜欢人类天堂而不是变成回形针海洋的星球(Nick Bostrom 详细描述了这一场景)。事实上,后一种结果对于食铁细菌氧化亚铁硫杆菌来说可能是乌托邦式的。如果没有人类偏好重要这一概念,谁能说细菌是错的呢?

An even more extreme solution is to avoid putting objectives into the machine altogether. Voilà, problem solved. Alas, it’s not as simple as that. Without objectives, there is no intelligence: any action is as good as any other, and the machine may as well be a random number generator. Without objectives, there is also no reason for the machine to prefer a human paradise to a planet turned into a sea of paperclips (a scenario described at length by Nick Bostrom). Indeed, the latter outcome may be utopian for the iron-eating bacterium Thiobacillus ferrooxidans. Absent some notion that human preferences matter, who is to say the bacterium is wrong?

“避免设定目标”这一想法的一个常见变体是,一个足够智能的系统必然会因其智能而自行制定“正确”的目标。通常,这一观点的支持者会援引这样的理论:智力更高的人往往有更无私和崇高的目标——这一观点可能与支持者的自我概念有关。

A common variant on the “avoid putting in objectives” idea is the notion that a sufficiently intelligent system will necessarily, as a consequence of its intelligence, develop the “right” goals on its own. Often, proponents of this notion appeal to the theory that people of greater intelligence tend to have more altruistic and lofty objectives—a view that may be related to the self-conception of the proponents.

十八世纪著名哲学家大卫·休谟在《人性论》中详细讨论了“可以从世界中感知目标”这一观点。30 他称之为“是与应当”问题,并得出结论:认为道德要求可以从自然事实中推导出来,完全是一种错误。要了解原因,可以考虑棋盘和棋子的设计。人们无法从中感知到将死对方的目标,因为同样的棋盘和棋子可用于自杀棋或许多其他尚待发明的游戏。

The idea that it is possible to perceive objectives in the world was discussed at length by the famous eighteenth-century philosopher David Hume in A Treatise of Human Nature.30 He called it the is-ought problem and concluded that it was simply a mistake to think that moral imperatives could be deduced from natural facts. To see why, consider, for example, the design of a chessboard and chess pieces. One cannot perceive in these the goal of checkmate, for the same chessboard and pieces can be used for suicide chess or indeed many other games still to be invented.

尼克·博斯特罗姆 (Nick Bostrom) 在《超级智能》中以不同的形式提出了同样的基本思想,他称之为正交性论点:

Nick Bostrom, in Superintelligence, presents the same underlying idea in a different form, which he calls the orthogonality thesis:

智能和最终目标是正交的:原则上,任何级别的智能都可以与任何最终目标相结合。

Intelligence and final goals are orthogonal: more or less any level of intelligence could in principle be combined with more or less any final goal.

这里的正交意味着“成直角”,即智能程度是定义智能系统的一个轴,而其目标是另一个轴,我们可以独立改变这两个轴。例如,可以为自动驾驶汽车指定任何特定地址作为目的地;让汽车成为更好的驾驶员并不意味着它会拒绝前往能被 17 整除的地址。同样,不难想象,可以为通用智能系统指定或多或少任何目标——包括最大化回形针的数量或已知的圆周率位数。强化学习系统和其他类型的奖励优化器就是这样工作的:算法完全通用,可以接受任何奖励信号。对于在标准模型内操作的工程师和计算机科学家来说,正交性论题是理所当然的。

Here, orthogonal means “at right angles” in the sense that the degree of intelligence is one axis defining an intelligent system and its goals are another axis, and we can vary these independently. For example, a self-driving car can be given any particular address as its destination; making the car a better driver doesn’t mean that it will start refusing to go to addresses that are divisible by seventeen. By the same token, it is easy to imagine that a general-purpose intelligent system could be given more or less any objective to pursue—including maximizing the number of paperclips or the number of known digits of pi. This is just how reinforcement learning systems and other kinds of reward optimizers work: the algorithms are completely general and accept any reward signal. For engineers and computer scientists operating within the standard model, the orthogonality thesis is just a given.
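To illustrate the point about reward optimizers, the sketch below (mine; the toy ring-shaped environment and both reward functions are arbitrary inventions) runs one and the same Q-learning loop on two different reward signals. Nothing in the algorithm depends on what the reward “means,” which is the orthogonality thesis in miniature.

```python
import random

# A minimal sketch (mine) of the reward-agnosticism described above: one
# generic Q-learning loop, unchanged, optimizes whatever reward function it
# is handed. The five-state ring environment and both rewards are arbitrary.
def q_learning(reward, n_states=5, n_actions=2, steps=20_000,
               alpha=0.1, gamma=0.9, epsilon=0.1):
    Q = [[0.0] * n_actions for _ in range(n_states)]
    s = 0
    for _ in range(steps):
        if random.random() < epsilon:                 # explore occasionally
            a = random.randrange(n_actions)
        else:                                         # otherwise act greedily
            a = max(range(n_actions), key=lambda act: Q[s][act])
        s2 = (s + 1) % n_states if a == 1 else (s - 1) % n_states
        Q[s][a] += alpha * (reward(s2) + gamma * max(Q[s2]) - Q[s][a])
        s = s2
    return Q

# The loop neither knows nor cares what the reward "means":
Q_goal = q_learning(lambda s: 1.0 if s == 3 else 0.0)  # "reach state 3"
Q_clip = q_learning(lambda s: float(s))                # "collect paperclips"
```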

智能系统只需观察世界就能获得应该追求的目标,这一观点表明,一个足够智能的系统会自然地放弃其最初的目标,转而选择正确的目标。很难理解为什么一个理性的主体会这样做。此外,它还预设了世界上存在一个“正确的”目标;这个目标必须是食铁细菌、人类和所有其他物种都同意的目标,这是很难想象的。

The idea that intelligent systems could simply observe the world to acquire the goals that should be pursued suggests that a sufficiently intelligent system will naturally abandon its initial objective in favor of the “right” objective. It’s hard to see why a rational agent would do this. Furthermore, it presupposes that there is a “right” objective out there in the world; it would have to be an objective on which iron-eating bacteria and humans and all other species agree, which is hard to imagine.

对 Bostrom 正交性论题最直白的批评来自著名机器人专家 Rodney Brooks,他断言程序不可能“聪明到能够发明颠覆人类社会的方法来实现人类为其设定的目标,而不理解它给人类带来问题的方式”。31不幸的是,程序不仅可能这样做,而且事实上,根据 Brooks 对这个问题的定义,这是不可避免的。Brooks 认为,“实现人类为其设定的目标”的最佳计划正在给人类带来问题。因此,这些问题反映了人类为其设定的目标中忽略的对人类有价值的东西。机器执行的最佳计划很可能会给人类带来问题,机器也可能意识到这一点。但根据定义,机器不会将这些问题视为问题。它们与机器无关。

The most explicit critique of Bostrom’s orthogonality thesis comes from the noted roboticist Rodney Brooks, who asserts that it’s impossible for a program to be “smart enough that it would be able to invent ways to subvert human society to achieve goals set for it by humans, without understanding the ways in which it was causing problems for those same humans.”31 Unfortunately, it’s not only possible for a program to behave like this; it is, in fact, inevitable, given the way Brooks defines the issue. Brooks posits that the optimal plan to “achieve goals set for it by humans” is causing problems for humans. It follows that those problems reflect things of value to humans that were omitted from the goals set for it by humans. The optimal plan being carried out by the machine may well cause problems for humans, and the machine may well be aware of this. But, by definition, the machine will not recognize those problems as problematic. They are none of its concern.

史蒂芬·平克似乎同意博斯特罗姆的正交性论题,他写道“智能是部署新手段实现目标的能力;目标与智能本身无关。”32 另一方面,他觉得难以想象“人工智能会如此聪明,以至于它可以弄清楚如何转化元素和重新连接大脑,但同时又如此愚蠢,以至于会基于基本的误解而造成严重破坏。”33 他继续说道:“选择最能满足冲突目标的行为的能力不是工程师可能会忘记安装和测试的附加功能;它就是智能。根据上下文解读语言使用者意图的能力也是如此。”当然,“满足冲突的目标”不是问题——这是从决策理论早期就已融入标准模型的东西。问题在于,机器所知道的冲突目标并不构成人类关注的全部;此外,在标准模型中,没有任何内容表明机器必须关心它没有被告知要关心的目标。

Steven Pinker seems to agree with Bostrom’s orthogonality thesis, writing that “intelligence is the ability to deploy novel means to attain a goal; the goals are extraneous to the intelligence itself.”32 On the other hand, he finds it inconceivable that “the AI would be so brilliant that it could figure out how to transmute elements and rewire brains, yet so imbecilic that it would wreak havoc based on elementary blunders of misunderstanding.”33 He continues, “The ability to choose an action that best satisfies conflicting goals is not an add-on that engineers might forget to install and test; it is intelligence. So is the ability to interpret the intentions of a language user in context.” Of course, “satisf[ying] conflicting goals” is not the problem—that’s something that’s been built into the standard model from the early days of decision theory. The problem is that the conflicting goals of which the machine is aware do not constitute the entirety of human concerns; moreover, within the standard model, there’s nothing to say that the machine has to care about goals it’s not told to care about.

然而,布鲁克斯和平克的言论中还是有一些有用的线索。比如说,机器为了追求其他目标而改变天空的颜色,同时却忽视了由此产生的明显的人类不满迹象,这在我们看来确实很愚蠢。我们之所以觉得这很愚蠢,是因为我们习惯于注意人类的不满,而且(通常)我们会有动力避免引起不满——即使我们之前并不知道这些人关心天空的颜色。也就是说,我们人类(1)关心其他人的偏好,(2)知道我们不知道所有这些偏好是什么。在下一章中,我将论证这些特性如果融入机器中,可能会为解决迈达斯国王问题提供一些初步思路。

There are, however, some useful clues in what Brooks and Pinker say. It does seem stupid to us for the machine to, say, change the color of the sky as a side effect of pursuing some other goal, while ignoring the obvious signs of human displeasure that result. It seems stupid to us because we are attuned to noticing human displeasure and (usually) we are motivated to avoid causing it—even if we were previously unaware that the humans in question cared about the color of the sky. That is, we humans (1) care about the preferences of other humans and (2) know that we don’t know what all those preferences are. In the next chapter, I argue that these characteristics, when built into a machine, may provide the beginnings of a solution to the King Midas problem.

辩论重启

The Debate, Restarted

本章简要介绍了知识界正在进行的一场辩论,这场辩论的双方是那些指出人工智能存在风险的人和那些对风险持怀疑态度的人。这场辩论在书籍、博客、学术论文、小组讨论、访谈、推文和报纸文章中都有所涉及。尽管“怀疑论者”——那些认为人工智能的风险可以忽略不计的人——做出了不懈的努力,但他们未能解释为什么超级人工智能系统必然会处于人类的控制之下;他们甚至没有试图解释为什么超级人工智能系统永远不会被开发出来。

This chapter has provided a glimpse into an ongoing debate in the broad intellectual community, a debate between those pointing to the risks of AI and those who are skeptical about the risks. It has been conducted in books, blogs, academic papers, panel discussions, interviews, tweets, and newspaper articles. Despite their valiant efforts, the “skeptics”—those who argue that the risk from AI is negligible—have failed to explain why superintelligent AI systems will necessarily remain under human control; and they have not even tried to explain why superintelligent AI systems will never be developed.

许多怀疑论者在被追问时都会承认,问题确实存在,即使它不是迫在眉睫。斯科特·亚历山大 (Scott Alexander) 在他的Slate Star Codex博客中对此进行了精彩的总结:34

Many skeptics will admit, if pressed, that there is a real problem, even if it’s not imminent. Scott Alexander, in his Slate Star Codex blog, summed it up brilliantly:34

“怀疑论者”的立场似乎是,虽然我们或许应该找几个聪明人开始研究这个问题的一些初步方面,但我们不应该惊慌失措,也不应该试图禁止人工智能研究。

与此同时,“信徒们”坚持认为,虽然我们不应该惊慌失措或试图禁止人工智能研究,但我们或许应该找几个聪明人开始研究这个问题的一些初步方面。

The “skeptic” position seems to be that, although we should probably get a couple of bright people to start working on preliminary aspects of the problem, we shouldn’t panic or start trying to ban AI research.

The “believers,” meanwhile, insist that although we shouldn’t panic or start trying to ban AI research, we should probably get a couple of bright people to start working on preliminary aspects of the problem.

虽然如果怀疑论者能提出一个无可辩驳的反对意见,比如一个简单而万无一失(并且防作恶)的人工智能控制问题解决方案,我会很高兴,但我认为这很可能不会发生,就像我们不可能找到一个简单而万无一失的网络安全解决方案或一个简单而万无一失的零风险核能生产方法一样。与其继续陷入部落辱骂和反复挖掘不可信的论点,不如像亚历山大所说的那样,开始着手解决该问题的一些初步方面。

Although I would be happy if the skeptics came up with an irrefutable objection, perhaps in the form of a simple and foolproof (and evil-proof) solution to the control problem for AI, I think it’s quite likely that this isn’t going to happen, any more than we’re going to find a simple and foolproof solution for cybersecurity or a simple and foolproof way to generate nuclear energy with zero risk. Rather than continue the descent into tribal name-calling and repeated exhumation of discredited arguments, it seems better, as Alexander puts it, to start working on some preliminary aspects of the problem.

这场争论凸显了我们面临的难题:如果我们制造机器来优化目标,我们赋予机器的目标必须符合我们的愿望,但我们不知道如何完整正确地定义人类的目标。幸运的是,还有一条折衷之路。

The debate has highlighted the conundrum we face: if we build machines to optimize objectives, the objectives we put into the machines have to match what we want, but we don’t know how to define human objectives completely and correctly. Fortunately, there is a middle way.

7

7

人工智能:一种不同的方法

AI: A DIFFERENT APPROACH

一旦怀疑论者的论点被驳斥,所有的“但是,但是”都得到了回答,接下来的问题通常是:“好吧,我承认存在问题,但没有解决方案,不是吗?”是的,有解决方案。

Once the skeptic’s arguments have been refuted and all the but but buts have been answered, the next question is usually, “OK, I admit there’s a problem, but there’s no solution, is there?” Yes, there is a solution.

让我们提醒自己当前的任务:设计具有高度智能的机器——以便它们能够帮助我们解决难题——同时确保这些机器的行为永远不会让我们感到非常不满。

Let’s remind ourselves of the task at hand: to design machines with a high degree of intelligence—so that they can help us with difficult problems—while ensuring that those machines never behave in ways that make us seriously unhappy.

幸运的是,任务并非如下:给定一台拥有高度智能的机器,想办法控制它。如果任务是那样的话,我们就完蛋了。一台被视为黑匣子、既成事实的机器,跟从外太空飞来的没什么两样。而我们控制来自外太空的超级智能实体的可能性几乎为零。类似的论点也适用于那些注定让我们无法理解其工作原理的人工智能系统创建方法;这些方法包括全脑模拟1——创建人类大脑的增强型电子副本——以及基于模拟程序进化的方法。2 我不会多谈这些建议,因为它们显然是个坏主意。

The task is, fortunately, not the following: given a machine that possesses a high degree of intelligence, work out how to control it. If that were the task, we would be toast. A machine viewed as a black box, a fait accompli, might as well have arrived from outer space. And our chances of controlling a superintelligent entity from outer space are roughly zero. Similar arguments apply to methods of creating AI systems that guarantee we won’t understand how they work; these methods include whole-brain emulation1—creating souped-up electronic copies of human brains—as well as methods based on simulated evolution of programs.2 I won’t say more about these proposals because they are so obviously a bad idea.

那么,过去人工智能领域是如何处理任务中“设计具有高度智能的机器”这一部分的呢?与许多其他领域一样,人工智能采用了标准模型:我们制造优化机器,将目标输入其中,然后它们就开始运行。当机器很笨并且行动范围有限时,这种方法很有效;如果你输入了错误的目标,你很有可能能够关闭机器,解决问题,然后再试一次。

So, how has the field of AI approached the “design machines with a high degree of intelligence” part of the task in the past? Like many other fields, AI has adopted the standard model: we build optimizing machines, we feed objectives into them, and off they go. That worked well when the machines were stupid and had a limited scope of action; if you put in the wrong objective, you had a good chance of being able to switch off the machine, fix the problem, and try again.

然而,随着按照标准模型设计的机器变得更加智能,并且它们的行动范围变得更加全球化,这种方法变得站不住脚。这样的机器会追求它们的目标,无论它有多么错误;它们会抵制关闭它们的企图;它们会获取有助于实现目标的任何和所有资源。事实上,机器的最佳行为可能包括欺骗人类,让人类认为他们给了机器一个合理的目标,以便获得足够的时间来实现赋予它的实际目标。这不是需要意识和自由意志的“异常”或“恶意”行为;它只是实现目标的最佳计划的一部分。

As machines designed according to the standard model become more intelligent, however, and as their scope of action becomes more global, the approach becomes untenable. Such machines will pursue their objective, no matter how wrong it is; they will resist attempts to switch them off; and they will acquire any and all resources that contribute to achieving the objective. Indeed, the optimal behavior for the machine might include deceiving the humans into thinking they gave the machine a reasonable objective, in order to gain enough time to achieve the actual objective given to it. This wouldn’t be “deviant” or “malicious” behavior requiring consciousness and free will; it would just be part of an optimal plan to achieve the objective.

在第 1 章中,我介绍了有益机器的概念,即其行为可望实现我们的目标而非它们自身目标的机器。我在本章中的目标是用简单的语言解释,尽管存在机器不知道我们的目标是什么这一明显的障碍,这件事如何能够做到。最终,这种方法应该会造就无论多么智能都不会对我们构成威胁的机器。

In Chapter 1, I introduced the idea of beneficial machines—that is, machines whose actions can be expected to achieve our objectives rather than their objectives. My goal in this chapter is to explain in simple terms how this can be done, despite the apparent drawback that the machines don’t know what our objectives are. The resulting approach should lead eventually to machines that present no threat to us, no matter how intelligent they are.

有益机器的原则

Principles for Beneficial Machines

我发现将这种方法总结为三个原则很有帮助。3 阅读这些原则时,请记住,它们主要是为人工智能研究人员和开发人员思考如何创建有益的人工智能系统提供指南;它们并不是要人工智能系统明确遵循的法则:4

I find it helpful to summarize the approach in the form of three principles.3 When reading these principles, keep in mind that they are intended primarily as a guide to AI researchers and developers in thinking about how to create beneficial AI systems; they are not intended as explicit laws for AI systems to follow:4

  1. 机器的唯一目标就是最大化实现人类的偏好。

  1. The machine’s only objective is to maximize the realization of human preferences.

  2. 机器最初不确定这些偏好是什么。

  2. The machine is initially uncertain about what those preferences are.

  3. 关于人类偏好的信息的最终来源是人类行为。

  3. The ultimate source of information about human preferences is human behavior.
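Taken together, the three principles describe a single decision problem, which might be rendered schematically as follows. This is a minimal sketch of my own, not a formalism from the book; the class and all names in it are illustrative.

```python
# A schematic sketch (mine, not the book's formalism) of the three principles
# as one decision problem. All names are illustrative.
class AssistantSketch:
    def __init__(self, hypotheses, prior):
        self.hypotheses = hypotheses   # candidate models of human preferences
        self.belief = dict(prior)      # principle 2: initial uncertainty

    def observe(self, human_choice, likelihood):
        # Principle 3: human behavior is the evidence about preferences.
        for h in self.belief:
            self.belief[h] *= likelihood(human_choice, self.hypotheses[h])
        z = sum(self.belief.values())
        self.belief = {h: p / z for h, p in self.belief.items()}

    def act(self, actions, utility):
        # Principle 1: maximize the expected realization of *human*
        # preferences, averaged over the current belief.
        return max(actions, key=lambda a: sum(
            p * utility(a, self.hypotheses[h])
            for h, p in self.belief.items()))
```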

在深入解释细节之前,重要的是要记住我在这些原则中所说的偏好的广泛范围。这里提醒一下我在第 2 章中所写的内容:如果你能以某种方式观看两部电影,每部都足够详细和广泛地描述你未来可能度过的生活,以至于每部都构成一次虚拟体验,你就可以说出你更喜欢哪一部,或者表示无所谓。因此,这里的偏好是包罗万象的;它们涵盖了你可能关心的一切,直到任意遥远的未来。5 而且它们是你的:机器不是要识别或采用一组理想的偏好,而是要理解并(尽可能)满足每个人的偏好。

Before delving into more detailed explanations, it’s important to remember the broad scope of what I mean by preferences in these principles. Here’s a reminder of what I wrote in Chapter 2: if you were somehow able to watch two movies, each describing in sufficient detail and breadth a future life you might lead, such that each constitutes a virtual experience, you could say which you prefer, or express indifference. Thus, preferences here are all-encompassing; they cover everything you might care about, arbitrarily far into the future.5 And they are yours: the machine is not looking to identify or adopt one ideal set of preferences but to understand and satisfy (to the extent possible) the preferences of each person.

第一条原则:纯粹利他机器

The first principle: Purely altruistic machines

第一个原则是,机器的唯一目标是最大限度地实现人类的偏好,这是有益机器概念的核心。具体来说,它将有益于人类,而不是蟑螂。这种受益者特定的利益概念是无法回避的。

The first principle, that the machine’s only objective is to maximize the realization of human preferences, is central to the notion of a beneficial machine. In particular, it will be beneficial to humans, rather than to, say, cockroaches. There’s no getting around this recipient-specific notion of benefit.

这一原则意味着机器纯粹是利他主义的——也就是说,它绝对不重视自己的幸福,甚至不重视自己的存在。它可能为了继续为人类做有用的事情而保护自己,或者因为它的主人会对不得不支付维修费用感到不高兴,或者因为看到脏兮兮或损坏的机器人可能会让路人感到有点难过,但不是因为它想活下去。任何对自我保护的偏好都会在机器人内部增加与人类福祉并不严格一致的额外激励。

The principle means that the machine is purely altruistic—that is, it attaches absolutely no intrinsic value to its own well-being or even its own existence. It might protect itself in order to continue doing useful things for humans, or because its owner would be unhappy about having to pay for repairs, or because the sight of a dirty or damaged robot might be mildly distressing to passersby, but not because it wants to be alive. Putting in any preference for self-preservation sets up an additional incentive within the robot that is not strictly aligned with human well-being.

第一条原则的措辞提出了两个至关重要的问题。每个问题都值得用整整一书架来讨论,事实上已经有很多关于这些问题的书了。

The wording of the first principle brings up two questions of fundamental importance. Each merits an entire bookshelf to itself, and in fact many books have already been written on these questions.

第一个问题是,人类是否真的具有有意义或稳定的偏好。事实上,“偏好”的概念是一种理想化,在很多方面与现实不符。例如,我们并非生来就具有成年后的偏好,因此这些偏好必须随着时间而改变。现在,我将假设这种理想化是合理的。稍后,我将研究当我们放弃这种理想化时会发生什么。

The first question is whether humans really have preferences in a meaningful or stable sense. In truth, the notion of a “preference” is an idealization that fails to match reality in several ways. For example, we aren’t born with the preferences we have as adults, so they must change over time. For now, I will assume that the idealization is reasonable. Later, I will examine what happens when we give up the idealization.

第二个问题是社会科学的一个主要问题:鉴于通常不可能确保每个人都得到他们最喜欢的结果——我们不可能都是宇宙之王——机器应该如何权衡多个人的偏好?同样,就目前而言——我保证在下一章中回到这个问题——采取平等对待每个人的简单方法似乎是合理的。这让人想起了 18 世纪功利主义的根源,即“为最多的人谋取最大的幸福”,6而要使这一原则在实践中发挥作用,需要许多注意事项和详细说明。其中最重要的可能是尚未出生的可能庞大的人口问题,以及如何考虑他们的偏好。

The second question is a staple of the social sciences: given that it is usually impossible to ensure that everyone gets their most preferred outcome—we can’t all be Emperor of the Universe—how should the machine trade off the preferences of multiple humans? Again, for the time being—and I promise to return to this question in the next chapter—it seems reasonable to adopt the simple approach of treating everyone equally. This is reminiscent of the roots of eighteenth-century utilitarianism in the phrase “the greatest happiness for the greatest numbers,”6 and there are many caveats and elaborations required to make this work in practice. Perhaps the most important of these is the matter of the possibly vast number of people not yet born, and how their preferences are to be taken into account.
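As a minimal illustration of that provisional equal-weighting rule (my toy example, with invented people and outcomes), the machine would pick the outcome maximizing the unweighted sum of individual utilities:

```python
# A toy rendering (mine; the people and outcomes are invented) of the
# provisional equal-weighting rule: choose the outcome that maximizes the
# unweighted sum of individual utilities.
def best_outcome(outcomes, utilities):
    """utilities: {person: {outcome: utility}}; everyone counts equally."""
    return max(outcomes, key=lambda o: sum(u[o] for u in utilities.values()))

prefs = {"alice": {"park": 2.0, "mall": 0.5},
         "bob":   {"park": 1.0, "mall": 3.0}}
print(best_outcome(["park", "mall"], prefs))  # 'mall': total 3.5 beats 3.0
```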

未来人类的问题引出了另一个相关问题:我们如何考虑非人类实体的偏好?也就是说,第一原则是否应该包括动物的偏好?(可能还包括植物的偏好?)这是一个值得辩论的问题,但其结果似乎不太可能对人工智能的未来道路产生重大影响。就其价值而言,人类的偏好可以而且确实包括关于动物福祉的条款,以及直接受益于动物存在的人类福祉的那些方面。7 如果说机器除此之外还应该关注动物的偏好,那就等于说人类应该制造比人类自己更关心动物的机器,这种立场很难站得住脚。一种更站得住脚的观点是,我们倾向于做出短视的决策——这违背了我们自身的利益——而这往往对环境及其动物栖息者造成负面影响。做出不那么短视的决策的机器将有助于人类采取更有利于环境的政策。而如果未来我们比现在更加重视动物的福祉——这可能意味着牺牲我们自身的一些内在福祉——那么机器也会做出相应的调整。

The issue of future humans brings up another, related question: How do we take into account the preferences of nonhuman entities? That is, should the first principle include the preferences of animals? (And possibly plants too?) This is a question worthy of debate, but the outcome seems unlikely to have a strong impact on the path forward for AI. For what it’s worth, human preferences can and do include terms for the well-being of animals, as well as for the aspects of human well-being that benefit directly from animals’ existence.7 To say that the machine should pay attention to the preferences of animals in addition to this is to say that humans should build machines that care more about animals than humans do, which is a difficult position to sustain. A more tenable position is that our tendency to engage in myopic decision making—which works against our own interests—often leads to negative consequences for the environment and its animal inhabitants. A machine that makes less myopic decisions would help humans adopt more environmentally sound policies. And if, in the future, we give substantially greater weight to the well-being of animals than we currently do—which probably means sacrificing some of our own intrinsic well-being—then machines will adapt accordingly.

第二项原则:谦逊的机器

The second principle: Humble machines

第二个原则是,机器最初不确定人类的偏好是什么,这是创造有益机器的关键。

The second principle, that the machine is initially uncertain about what human preferences are, is the key to creating beneficial machines.

如果机器认为自己完全了解真正的目标,那么它就会一心一意地追求目标。它永远不会问某个行动方案是否妥当,因为它已经知道这是实现目标的最优解。它会无视跳上跳下大喊“停下,你会毁灭世界的!”的人类,因为那些只是言语而已。假设对目标有完全的了解,机器就会与人类脱钩:人类做什么不再重要,因为机器知道目标并会去追求它。

A machine that assumes it knows the true objective perfectly will pursue it single-mindedly. It will never ask whether some course of action is OK, because it already knows it’s an optimal solution for the objective. It will ignore humans jumping up and down screaming, “Stop, you’re going to destroy the world!” because those are just words. Assuming perfect knowledge of the objective decouples the machine from the human: what the human does no longer matters, because the machine knows the goal and pursues it.

另一方面,不确定真正目标的机器会表现出一种谦卑:例如,它会服从人类并允许自己被关掉。它认为,只有当它做错事时,人类才会关掉它——也就是说,做一些与人类偏好相反的事情。根据第一条原则,它想避免这样做,但根据第二条原则,它知道这是可能的,因为它不知道“错误”到底是什么。所以,如果人类确实关掉了机器,那么机器就会避免做错误的事情,而这正是它想要的。换句话说,机器有积极的动机让自己被关掉。它仍然与人类联系在一起,人类是潜在的信息来源,可以帮助它避免错误并做得更好。

On the other hand, a machine that is uncertain about the true objective will exhibit a kind of humility: it will, for example, defer to humans and allow itself to be switched off. It reasons that the human will switch it off only if it’s doing something wrong—that is, doing something contrary to human preferences. By the first principle, it wants to avoid doing that, but, by the second principle, it knows that’s possible because it doesn’t know exactly what “wrong” is. So, if the human does switch the machine off, then the machine avoids doing the wrong thing, and that’s what it wants. In other words, the machine has a positive incentive to allow itself to be switched off. It remains coupled to the human, who is a potential source of information that will allow it to avoid mistakes and do a better job.
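This incentive can be checked with a small calculation. The sketch below is my toy rendering of the reasoning just given (the Gaussian belief over the action’s value u is an arbitrary assumption): it compares acting anyway, switching itself off, and deferring to a human who vetoes exactly when u < 0.

```python
import random

# A toy numerical rendering (mine) of the deference argument above. The
# robot's proposed action has some value u to the human; the robot does not
# know u and holds only a belief over it, represented here by samples from
# an arbitrary assumed Gaussian.
random.seed(0)
u_samples = [random.gauss(0.5, 2.0) for _ in range(100_000)]

act_anyway = sum(u_samples) / len(u_samples)                  # E[u]
switch_self_off = 0.0                                         # value of doing nothing
# Defer: propose the action, and let the human veto it whenever u < 0.
defer = sum(max(u, 0.0) for u in u_samples) / len(u_samples)  # E[max(u, 0)]

print(f"act={act_anyway:.3f}  off={switch_self_off:.3f}  defer={defer:.3f}")
# E[max(u, 0)] >= max(E[u], 0): allowing itself to be switched off is never
# worse, and is strictly better under genuine uncertainty, assuming the human
# vetoes exactly when u < 0.
```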

自 20 世纪 80 年代以来,不确定性一直是人工智能关注的焦点;事实上,“现代人工智能”一词通常指的是当不确定性最终被承认为现实世界决策中普遍存在的问题时发生的革命。然而,人工智能系统目标中的不确定性却被忽略了。在所有关于效用最大化、目标实现、成本最小化、回报最大化和损失最小化的工作中,人们都假设效用函数、目标、成本函数、回报函数和损失函数是完全已知的。怎么会这样?人工智能界(以及控制理论、运筹学和统计学界)怎么会在如此长的时间内存在如此巨大的盲点,即使在决策的所有其他方面都接受不确定性?8

Uncertainty has been a central concern in AI since the 1980s; indeed the phrase “modern AI” often refers to the revolution that took place when uncertainty was finally recognized as a ubiquitous issue in real-world decision making. Yet uncertainty in the objective of the AI system was simply ignored. In all the work on utility maximization, goal achievement, cost minimization, reward maximization, and loss minimization, it is assumed that the utility function, the goal, the cost function, the reward function, and the loss function are known perfectly. How could this be? How could the AI community (and the control theory, operations research, and statistics communities) have such a huge blind spot for so long, even while embracing uncertainty in all other aspects of decision making?8

人们可以提出一些相当复杂的技术借口,9但我怀疑事实是,除了一些值得尊敬的例外,10人工智能研究人员只是接受了将我们的人类智能概念映射到机器智能的标准模型:人类有目标并追求它们,所以机器应该有目标并追求它们。他们,或者我应该说我们,从未真正研究过这个基本假设。它内置于所有现有的构建智能系统的方法中。

One could make some rather complicated technical excuses,9 but I suspect the truth is that, with some honorable exceptions,10 AI researchers simply bought into the standard model that maps our notion of human intelligence onto machine intelligence: humans have objectives and pursue them, so machines should have objectives and pursue them. They, or should I say we, never really examined this fundamental assumption. It is built into all existing approaches for constructing intelligent systems.

第三项原则:学习预测人类的偏好

The third principle: Learning to predict human preferences

第三个原则是,有关人类偏好的信息的最终来源是人类行为,这有两个目的。

The third principle, that the ultimate source of information about human preferences is human behavior, serves two purposes.

第一个目的是为“人类偏好”这一术语提供明确的基础。根据假设,人类的偏好并不存在于机器中,机器也无法直接观察到它们,但机器和人类偏好之间肯定存在某种明确的联系。该原则认为,这种联系是通过观察人类的选择而建立的:我们假设选择以某种(可能非常复杂的)方式与潜在的偏好相关。要了解这种联系为何必不可少,请考虑相反的情况:如果某种人类偏好对人类可能做出的任何实际或假设的选择没有任何影响,那么说这种偏好存在可能就毫无意义。

The first purpose is to provide a definite grounding for the term human preferences. By assumption, human preferences aren’t in the machine and it cannot observe them directly, but there must still be some definite connection between the machine and human preferences. The principle says that the connection is through the observation of human choices: we assume that choices are related in some (possibly very complicated) way to underlying preferences. To see why this connection is essential, consider the converse: if some human preference had no effect whatsoever on any actual or hypothetical choice the human might make, then it would probably be meaningless to say that the preference exists.

第二个目的是使机器在更多了解我们想要什么的过程中变得更有用。(毕竟,如果它对人类的偏好一无所知,那么它对我们就没有用处。)这个想法很简单:人类的选择揭示了有关人类偏好的信息。应用于菠萝披萨和香肠披萨之间的选择,这很简单。应用于未来生活之间的选择,以及以影响机器人行为为目标而做出的选择,事情就变得更有趣了。在下一章中,我将解释如何提出和解决此类问题。然而,真正的复杂之处在于人类并非完全理性:不完美性横亘在人类偏好和人类选择之间,如果机器要将人类选择解读为人类偏好的证据,就必须考虑到这些不完美性。

The second purpose is to enable the machine to become more useful as it learns more about what we want. (After all, if it knew nothing about human preferences, it would be of no use to us.) The idea is simple enough: human choices reveal information about human preferences. Applied to the choice between pineapple pizza and sausage pizza, this is straightforward. Applied to choices between future lives and choices made with the goal of influencing the robot’s behavior, things get more interesting. In the next chapter I explain how to formulate and solve such problems. The real complications arise, however, because humans are not perfectly rational: imperfection comes between human preferences and human choices, and the machine must take into account those imperfections if it is to interpret human choices as evidence of human preferences.
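Here is a small worked version of the pizza case; it is my illustration, using the standard softmax “noisily rational” choice model rather than anything specified in this book. Each observed choice multiplies the probability of each preference hypothesis by the likelihood that hypothesis assigns to the choice.

```python
import math

# A worked toy version (mine) of inference from choices, using the standard
# softmax "noisily rational" model: the human picks a over b with probability
# exp(beta*u(a)) / (exp(beta*u(a)) + exp(beta*u(b))). Lower beta means
# noisier choices, so each observation carries less evidence.
def update(prior, utilities, choices, beta=1.0):
    """prior: {hypothesis: prob}; utilities: {hypothesis: {item: utility}};
       choices: list of (chosen, rejected) pairs."""
    post = dict(prior)
    for chosen, rejected in choices:
        for h in post:
            ua, ub = utilities[h][chosen], utilities[h][rejected]
            post[h] *= math.exp(beta * ua) / (math.exp(beta * ua) + math.exp(beta * ub))
        z = sum(post.values())
        post = {h: p / z for h, p in post.items()}
    return post

utilities = {"likes_pineapple": {"pineapple": 1.0, "sausage": 0.0},
             "likes_sausage":   {"pineapple": 0.0, "sausage": 1.0}}
prior = {"likes_pineapple": 0.5, "likes_sausage": 0.5}
# Three observed pineapple choices shift the belief, but not to certainty:
print(update(prior, utilities, [("pineapple", "sausage")] * 3))
```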

我不是那个意思

Not what I mean

在进一步详细阐述之前,我想先消除一些潜在的误解。

Before going into more detail, I want to head off some potential misunderstandings.

第一个也是最常见的误解是,我提议在机器中安装一个我自己设计的单一、理想化的价值体系来指导机器的行为。“你要把谁的价值观放进去?”“谁来决定这些价值观是什么?”或者甚至是“是什么让像拉塞尔这样的西方富裕白人男性顺性别科学家有权决定机器如何编码和发展人类价值观?” 11

The first and most common misunderstanding is that I am proposing to install in machines a single, idealized value system of my own design that guides the machine’s behavior. “Whose values are you going to put in?” “Who gets to decide what the values are?” Or even, “What gives Western, well-off, white male cisgender scientists such as Russell the right to determine how the machine encodes and develops human values?”11

我认为这种混淆部分来自于“价值”的常识意义与它在经济学、人工智能和运筹学中更技术性的意义之间的不幸冲突。在日常用法中,价值观是人们用来帮助解决道德困境的东西;另一方面,作为技术术语,价值大致与效用同义,效用衡量的是从披萨到天堂的任何事物的可取程度。我想要的是技术意义:我只想确保机器给我正确的披萨,并且不会意外毁灭人类。(找到我的钥匙将是一个意外的收获。)为了避免这种混淆,这些原则谈论的是人类的偏好而不是人类的价值观,因为前一个术语似乎可以避开关于道德的评判性先入之见。

I think this confusion comes partly from an unfortunate conflict between the commonsense meaning of value and the more technical sense in which it is used in economics, AI, and operations research. In ordinary usage, values are what one uses to help resolve moral dilemmas; as a technical term, on the other hand, value is roughly synonymous with utility, which measures the degree of desirability of anything from pizza to paradise. The meaning I want is the technical one: I just want to make sure the machines give me the right pizza and don’t accidentally destroy the human race. (Finding my keys would be an unexpected bonus.) To avoid this confusion, the principles talk about human preferences rather than human values, since the former term seems to steer clear of judgmental preconceptions about morality.

当然,“代入价值观”正是我要说的我们应该避免的错误,因为获得完全正确的价值观(或偏好)非常困难,而错误则可能带来灾难性的后果。我建议,机器学习可以更好地预测每个人更喜欢哪种生活,同时意识到预测具有高度的不确定性和不完整性。原则上,机器可以学习数十亿种不同的预测偏好模型,地球上数十亿人每人各有一种。对于未来的人工智能系统来说,这并不是过分的要求,因为当今的 Facebook 系统已经维护了超过 20 亿个个人资料。

“Putting in values” is, of course, exactly the mistake I am saying we should avoid, because getting the values (or preferences) exactly right is so difficult and getting them wrong is potentially catastrophic. I am proposing instead that machines learn to predict better, for each person, which life that person would prefer, all the while being aware that the predictions are highly uncertain and incomplete. In principle, the machine can learn billions of different predictive preference models, one for each of the billions of people on Earth. This is really not too much to ask for the AI systems of the future, given that present-day Facebook systems are already maintaining more than two billion individual profiles.

一个相关的误解是,目标是让机器具备“伦理”或“道德价值观”,使它们能够解决道德困境。人们经常会提到所谓的电车难题,12 即人们必须选择是否杀死一个人以拯救其他人,因为它们被认为与自动驾驶汽车相关。然而,道德困境的重点在于,它们就是困境:双方都有很好的论据。人类的生存不是道德困境。机器可以用错误的方式(无论那是什么方式)解决大多数道德困境,而仍然不会对人类造成灾难性的影响。13

A related misunderstanding is that the goal is to equip machines with “ethics” or “moral values” that will enable them to resolve moral dilemmas. Often, people bring up the so-called trolley problems,12 where one has to choose whether to kill one person in order to save others, because of their supposed relevance to self-driving cars. The whole point of moral dilemmas, however, is that they are dilemmas: there are good arguments on both sides. The survival of the human race is not a moral dilemma. Machines could solve most moral dilemmas the wrong way (whatever that is) and still have no catastrophic impact on humanity.13

另一个常见的假设是,遵循这三个原则的机器会采纳它们观察到并学习的邪恶人类的所有罪恶。当然,我们中的许多人的选择都有些不尽如人意,但没有理由认为研究我们动机的机器会做出同样的选择,就像犯罪学家不会成为罪犯一样。举个例子,一个腐败的政府官员要求贿赂才能批准建筑许可,因为他微薄的薪水无法支付孩子上大学的费用。观察这种行为的机器不会学会收受贿赂;它会了解到,这位官员和许多其他人一样,非常希望自己的孩子受教育并取得成功。它会找到不降低他人福祉的方法来帮助他。这并不是说所有邪恶行为对机器来说都是没有问题的——例如,机器可能需要区别对待那些积极喜欢别人受苦的人。

Another common supposition is that machines that follow the three principles will adopt all the sins of the evil humans they observe and learn from. Certainly, there are many of us whose choices leave something to be desired, but there is no reason to suppose that machines who study our motivations will make the same choices, any more than criminologists become criminals. Take, for example, the corrupt government official who demands bribes to approve building permits because his paltry salary won’t pay for his children to go to university. A machine observing this behavior will not learn to take bribes; it will learn that the official, like many other people, has a very strong desire for his children to be educated and successful. It will find ways to help him that don’t involve lowering the well-being of others. This is not to say that all cases of evil behavior are unproblematic for machines—for example, machines may need to treat differently those who actively prefer the suffering of others.

乐观的理由

Reasons for Optimism

简而言之,我认为如果我们想控制日益智能化的机器,就需要将人工智能引向一个全新的方向。我们需要摆脱 20 世纪技术发展的主导思想之一:优化给定目标的机器。我经常被问到,鉴于人工智能和相关学科的标准模型背后有着巨大的发展势头,为什么我认为这是可行的。事实上,我对此非常乐观。

In a nutshell, I am suggesting that we need to steer AI in a radically new direction if we want to retain control over increasingly intelligent machines. We need to move away from one of the driving ideas of twentieth-century technology: machines that optimize a given objective. I am often asked why I think this is even remotely feasible, given the huge momentum behind the standard model in AI and related disciplines. In fact, I am quite optimistic that it can be done.

乐观的第一个原因是,开发服从人类并逐渐适应用户偏好和意图的人工智能系统具有强大的经济动机。这样的系统将非常受欢迎:它们可以表现出的行为范围远远超过具有固定、已知目标的机器。它们会在适当的时候向人类提问或征求许可;它们会进行“试运行”,看看我们是否喜欢它们提议做的事情;它们做错事时会接受纠正。另一方面,做不到这一点的系统将产生严重后果。到目前为止,人工智能系统的愚蠢和有限范围使我们免受这些后果的影响,但这种情况将会改变。例如,想象一下,未来的某个家用机器人负责在你工作到很晚的时候照顾你的孩子。孩子们饿了,但冰箱是空的。然后机器人注意到了猫。唉,机器人知道猫的营养价值,却不知道它的情感价值。短短几个小时内,关于疯狂机器人和烤猫的头条新闻就占据了全球媒体,整个家用机器人行业都倒闭了。

The first reason for optimism is that there are strong economic incentives to develop AI systems that defer to humans and gradually align themselves to user preferences and intentions. Such systems will be highly desirable: the range of behaviors they can exhibit is simply far greater than that of machines with fixed, known objectives. They will ask humans questions or ask for permission when appropriate; they will do “trial runs” to see if we like what they propose to do; they will accept correction when they do something wrong. On the other hand, systems that fail to do this will have severe consequences. Up to now, the stupidity and limited scope of AI systems has protected us from these consequences, but that will change. Imagine, for example, some future domestic robot charged with looking after your children while you are working late. The children are hungry, but the refrigerator is empty. Then the robot notices the cat. Alas, the robot understands the cat’s nutritional value but not its sentimental value. Within a few short hours, headlines about deranged robots and roasted cats are blanketing the world’s media and the entire domestic-robot industry is out of business.

一个行业参与者可能因为粗心的设计而毁掉整个行业,这种可能性为组建安全导向的行业联盟和执行安全标准提供了强大的经济动机。人工智能伙伴关系组织的成员包括几乎所有世界领先的科技公司,该组织已同意合作,以确保“人工智能研究和技术是稳健、可靠、值得信赖的,并在安全约束范围内运行”。据我所知,所有主要参与者都在公开文献中发表他们的安全导向研究。因此,经济激励早在我们达到人类水平的人工智能之前就已存在,而且只会随着时间的推移而增强。此外,同样的合作动态可能正在国际层面开始——例如,中国政府的既定政策是“合作先发制人地防止人工智能的威胁”。14

The possibility that one industry player could destroy the entire industry through careless design provides a strong economic motivation to form safety-oriented industry consortia and to enforce safety standards. Already, the Partnership on AI, which includes as members nearly all the world’s leading technology companies, has agreed to cooperate to ensure that “AI research and technology is robust, reliable, trustworthy, and operates within secure constraints.” To my knowledge, all the major players are publishing their safety-oriented research in the open literature. Thus, the economic incentive is in operation long before we reach human-level AI and will only strengthen over time. Moreover, the same cooperative dynamic may be starting at the international level—for example, the stated policy of the Chinese government is to “cooperate to preemptively prevent the threat of AI.”14

乐观的第二个原因是,用于了解人类偏好的原始数据(即人类行为的例子)非常丰富。这些数据不仅来自数十亿台机器通过摄像头、键盘和触摸屏进行的直接观察——这些机器彼此分享有关数十亿人类的数据(当然要受到隐私限制)——而且还来自间接观察。最明显的间接证据是人类留下的浩繁记录:书籍、电影、电视和广播节目,它们几乎完全是关于人们做各种事情(以及其他人对此感到不满)的。即使是最早、最乏味的苏美尔和埃及用铜锭换取大麦袋的记录,也让我们得以一窥人类对不同商品的偏好。

A second reason for optimism is that the raw data for learning about human preferences—namely, examples of human behavior—are so abundant. The data come not just in the form of direct observation via camera, keyboard, and touch screen by billions of machines sharing data with one another about billions of humans (subject to privacy constraints, of course) but also in indirect form. The most obvious kind of indirect evidence is the vast human record of books, films, and television and radio broadcasts, which is almost entirely concerned with people doing things (and other people being upset about it). Even the earliest and most tedious Sumerian and Egyptian records of copper ingots being traded for sacks of barley give some insight into human preferences for different commodities.

当然,解读这些原始材料会遇到很多困难,这些材料包括宣传、虚构、疯子的胡言乱语,甚至政客和总统的言论,但机器当然没有理由全盘接受。机器可以而且应该将来自其他智能实体的所有通信解读为游戏中的动作,而不是事实陈述;在某些游戏中,例如一个人和一台机器的合作游戏中,人类有说真话的动机,但在许多其他情况下,人类有撒谎的动机。当然,无论诚实与否,人类都可能被自己的信念所欺骗。

There are, of course, difficulties involved in interpreting this raw material, which includes propaganda, fiction, the ravings of lunatics, and even the pronouncements of politicians and presidents, but there is certainly no reason for the machine to take it all at face value. Machines can and should interpret all communications from other intelligent entities as moves in a game rather than as statements of fact; in some games, such as cooperative games with one human and one machine, the human has an incentive to be truthful, but in many other situations there are incentives to be dishonest. And of course, whether honest or dishonest, humans may be deluded in their own beliefs.

还有第二种间接证据就摆在我们眼前:我们塑造世界的方式。15 我们把世界塑造成这样,粗略地说,是因为我们喜欢它这样。(显然,它并不完美!)现在,想象你是一个外星人,在所有人类都外出度假时来到地球。当你窥视他们的房子时,你能开始了解人类偏好的基本知识吗?地毯铺在地板上,是因为我们喜欢走在柔软、温暖的表面上,不喜欢响亮的脚步声;花瓶放在桌子中间而不是边缘,是因为我们不想让它们掉下来摔碎;等等——一切并非由自然本身安排的事物,都为居住在这个星球上的奇怪两足动物的好恶提供了线索。

There is a second kind of indirect evidence that is staring us in the face: the way we have made the world.15 We made it that way because—very roughly—we like it that way. (Obviously, it’s not perfect!) Now, imagine you are an alien visiting Earth while all the humans are away on holiday. As you peer inside their houses, can you begin to grasp the basics of human preferences? Carpets are on floors because we like to walk on soft, warm surfaces and we don’t like loud footsteps; vases are on the middle of the table rather than the edge because we don’t want them to fall and break; and so on—everything that isn’t arranged by nature itself provides clues to the likes and dislikes of the strange bipedal creatures who inhabit this planet.

谨慎的理由

Reasons for Caution

如果你一直在关注自动驾驶汽车的发展,你可能会觉得人工智能伙伴关系组织在人工智能安全方面做出的合作承诺并不那么令人安心。这个领域的竞争异常激烈,而且有非常充分的理由:第一个推出全自动驾驶汽车的汽车制造商将获得巨大的市场优势;这种优势将自我强化,因为该制造商将能够更快地收集更多数据以改善系统性能;而如果另一家公司抢在 Uber 之前推出全自动驾驶出租车,Uber 等叫车公司将很快破产。这导致了一场高风险的竞赛,在这场竞赛中,谨慎和精心的工程设计似乎不如花哨的演示、人才争夺和过早的产品发布重要。

You may find the Partnership on AI’s promises of cooperation on AI safety less than reassuring if you have been following progress in self-driving cars. That field is ruthlessly competitive, for some very good reasons: the first car manufacturer to release a fully autonomous vehicle will gain a huge market advantage; that advantage will be self-reinforcing because the manufacturer will be able to collect more data more quickly to improve the system’s performance; and ride-hailing companies such as Uber would quickly go out of business if another company were to roll out fully autonomous taxis before Uber does. This has led to a high-stakes race in which caution and careful engineering appear to be less important than snazzy demos, talent grabs, and premature rollouts.

因此,生死攸关的经济竞争为人们提供了在安全问题上偷工减料以期赢得竞赛的动力。生物学家保罗·伯格曾共同组织了 1975 年的阿西洛马会议,该会议最终促成了对人类基因改造的暂停。他在 2008 年一篇回顾该会议的论文中写道:16

Thus, life-or-death economic competition provides an impetus to cut corners on safety in the hope of winning the race. In a 2008 retrospective paper on the 1975 Asilomar conference that he co-organized—the conference that led to a moratorium on genetic modification of humans—the biologist Paul Berg wrote,16

阿西洛马事件给所有科学界都上了一课:应对新兴知识或早期技术引发的担忧的最佳方式是让来自公共资助机构的科学家与广大公众就最佳监管方式达成共识——越早越好。一旦来自企业的科学家开始主宰研究事业,那就太晚了。

There is a lesson in Asilomar for all of science: the best way to respond to concerns created by emerging knowledge or early-stage technologies is for scientists from publicly funded institutions to find common cause with the wider public about the best way to regulate—as early as possible. Once scientists from corporations begin to dominate the research enterprise, it will simply be too late.

经济竞争不仅发生在企业之间,也发生在国家之间。最近,美国、中国、法国、英国和欧盟宣布在人工智能领域投资数十亿美元,这无疑表明没有一个大国愿意落后。2017 年,俄罗斯总统弗拉基米尔·普京表示:“谁成为 [人工智能] 的领导者,谁就是世界的统治者。” 17这种分析基本上是正确的。正如我们在第 3 章中看到的那样,先进的人工智能将大大提高几乎所有领域的生产力和创新率。如果不共享,它将使其拥有者胜过任何竞争对手国家或集团。

Economic competition occurs not just between corporations but also between nations. A recent flurry of announcements of multibillion-dollar national investments in AI from the United States, China, France, Britain, and the EU certainly suggests that none of the major powers wants to be left behind. In 2017, Russian president Vladimir Putin said, “The one who becomes the leader in [AI] will be the ruler of the world.”17 This analysis is essentially correct. Advanced AI would, as we saw in Chapter 3, lead to greatly increased productivity and rates of innovation in almost all areas. If not shared, it would allow its possessor to outcompete any rival nation or bloc.

尼克·博斯特罗姆 (Nick Bostrom) 在《超级智能》一书中警告的正是这种动机。国家竞争就像企业竞争一样,往往更注重原始能力的进步,而较少关注控制问题。然而,普京也许读过博斯特罗姆的书;他接着说:“如果有人赢得垄断地位,那将是非常不可取的。”这也将是毫无意义的,因为人类级别的人工智能不是零和游戏,分享它不会有任何损失。另一方面,在没有先解决控制问题的情况下,争相成为第一个实现人类级别人工智能的人,是一场负和游戏。每个人的收益都是负无穷。

Nick Bostrom, in Superintelligence, warns against exactly this motivation. National competition, just like corporate competition, would tend to focus more on advances in raw capabilities and less on the problem of control. Perhaps, however, Putin has read Bostrom; he went on to say, “It would be strongly undesirable if someone wins a monopolist position.” It would also be rather pointless, because human-level AI is not a zero-sum game and nothing is lost by sharing it. On the other hand, competing to be the first to achieve human-level AI, without first solving the control problem, is a negative-sum game. The payoff for everyone is minus infinity.

人工智能研究人员能够影响全球人工智能政策演变的程度非常有限。我们可以指出可能带来经济和社会效益的应用;我们可以警告监视和武器等可能的滥用;我们可以为未来发展的可能路径及其影响提供路线图。也许我们能做的最重要的事情,是设计出在可能范围内可证明安全且对人类有益的人工智能系统。只有这样,尝试对人工智能进行普遍监管才有意义。

There’s only a limited amount that AI researchers can do to influence the evolution of global policy on AI. We can point to possible applications that would provide economic and social benefits; we can warn about possible misuses such as surveillance and weapons; and we can provide roadmaps for the likely path of future developments and their impacts. Perhaps the most important thing we can do is to design AI systems that are, to the extent possible, provably safe and beneficial for humans. Only then will it make sense to attempt general regulation of AI.

8

8

可证明有益的人工智能

PROVABLY BENEFICIAL AI

如果我们要按照新思路重建人工智能,基础必须牢固。当人类的未来受到威胁时,希望和良好意愿——以及教育举措、行业行为准则、立法和做正确事情的经济激励——是不够的。所有这些都可能出错,而且经常失败。在这种情况下,我们寄望于精确的定义和严格的逐步数学证明来提供无可辩驳的保证。

If we are going to rebuild AI along new lines, the foundations must be solid. When the future of humanity is at stake, hope and good intentions—and educational initiatives and industry codes of conduct and legislation and economic incentives to do the right thing—are not enough. All of these are fallible, and they often fail. In such situations, we look to precise definitions and rigorous step-by-step mathematical proofs to provide incontrovertible guarantees.

这是一个好的开始,但我们需要做的还不止于此。我们需要尽可能地确保所保证的确实是我们想要的,并且证明中的假设确实是真实的。证明本身属于为专家撰写的期刊论文,但我认为了解什么是证明以及它们在真正的安全方面能提供什么和不能提供什么仍然很有用。本章标题中的“可证明的有益”是一种愿望,而不是承诺,但它是正确的愿望。

That’s a good start, but we need more. We need to be sure, to the extent possible, that what is guaranteed is actually what we want and that the assumptions going into the proof are actually true. The proofs themselves belong in journal papers written for specialists, but I think it is useful nonetheless to understand what proofs are and what they can and cannot provide in the way of real safety. The “provably beneficial” in the title of the chapter is an aspiration rather than a promise, but it is the right aspiration.

数学保证

Mathematical Guarantees

我们最终会想要证明这样的定理:某种特定的人工智能系统设计方法能够确保它们对人类有益。定理只是断言的别称,只不过它表述得足够精确,以便可以检验其在任何特定情况下是否为真。也许最著名的定理是费马大定理,它由法国数学家皮埃尔·德·费马于 1637 年提出猜想,经过 357 年的努力(并非全部由怀尔斯完成),最终于 1994 年由安德鲁·怀尔斯证明。1这个定理可以用一行字写出来,但证明却是一百多页艰深的数学推导。

We will want, eventually, to prove theorems to the effect that a particular way of designing AI systems ensures that they will be beneficial to humans. A theorem is just a fancy name for an assertion, stated precisely enough so that its truth in any particular situation can be checked. Perhaps the most famous theorem is Fermat’s Last Theorem, which was conjectured by the French mathematician Pierre de Fermat in 1637 and finally proved by Andrew Wiles in 1994 after 357 years of effort (not all of it by Wiles).1 The theorem can be written in one line, but the proof is over one hundred pages of dense mathematics.

证明从公理开始,公理是其真实性被直接假定的断言。通常,公理只是定义,例如费马定理所需的整数、加法和幂的定义。证明从公理出发,通过逻辑上无可辩驳的步骤推进,不断添加新的断言,直到定理本身作为其中某一步的结果而得以确立。

Proofs begin from axioms, which are assertions whose truth is simply assumed. Often, the axioms are just definitions, such as the definitions of integers, addition, and exponentiation needed for Fermat’s theorem. The proof proceeds from the axioms by logically incontrovertible steps, adding new assertions until the theorem itself is established as a consequence of one of the steps.

这是一个相当明显的定理,它几乎立即从整数和加法的定义中得出:1 + 2 = 2 + 1。我们把它称为罗素定理。这不是什么发现。另一方面,费马大定理感觉像是某种全新的东西——发现了一些以前未知的东西。然而,差别只是程度的问题。罗素定理和费马定理的真实性已经包含在公理中。证明只是将已经隐含的内容明确化。它们可以长或短,但它们并没有增加任何新内容。定理的好坏取决于其中的假设。

Here’s a fairly obvious theorem that follows almost immediately from the definitions of integers and addition: 1 + 2 = 2 + 1. Let’s call this Russell’s theorem. It’s not much of a discovery. On the other hand, Fermat’s Last Theorem feels like something completely new—a discovery of something previously unknown. The difference, however, is just a matter of degree. The truth of both Russell’s and Fermat’s theorems is already contained in the axioms. Proofs merely make explicit what was already implicit. They can be long or short, but they add nothing new. The theorem is only as good as the assumptions that go into it.

在数学上,这没有问题,因为数学研究的是我们自己定义的抽象对象——数字、集合等等。公理为真,是因为我们规定它们为真。另一方面,如果你想证明关于现实世界的某些东西——例如,按这种方式设计的人工智能系统不会故意害死你——那么你的公理就必须在现实世界中为真。如果它们不为真,你证明的只是关于某个想象世界的结论。

That’s fine when it comes to mathematics, because mathematics is about abstract objects that we define—numbers, sets, and so on. The axioms are true because we say so. On the other hand, if you want to prove something about the real world—for example, that AI systems designed like so won’t kill you on purpose—your axioms have to be true in the real world. If they aren’t true, you’ve proved something about an imaginary world.

科学和工程学有着悠久而光荣的传统:证明关于想象世界的结果。例如,在结构工程中,人们可能会看到这样开头的数学分析:“设 AB 为刚性梁……”这里的“刚性”一词并不意味着“由钢之类的坚硬材料制成”,而是意味着“强度无限”,即它根本不会弯曲。刚性梁并不存在,所以这是一个想象的世界。诀窍在于知道可以偏离现实世界多远,同时仍能获得有用的结果。例如,如果刚性梁假设允许工程师计算包含该梁的结构中的受力,而这些力小到只会使真实的钢梁产生极小的弯曲,那么工程师就有理由相信,这一分析可以从想象世界迁移到现实世界。

Science and engineering have a long and honorable tradition of proving results about imaginary worlds. In structural engineering, for example, one might see a mathematical analysis that begins, “Let AB be a rigid beam. . . .” The word rigid here doesn’t mean “made of something hard like steel”; it means “infinitely strong,” so that it doesn’t bend at all. Rigid beams do not exist, so this is an imaginary world. The trick is to know how far one can stray from the real world and still obtain useful results. For example, if the rigid-beam assumption allows an engineer to calculate the forces in a structure that includes the beam, and those forces are small enough to bend a real steel beam by only a tiny amount, then the engineer can be reasonably confident that the analysis will transfer from the imaginary world to the real world.

优秀的工程师能够感知到这种传递何时会失效。例如,如果梁受到压缩,两端受到巨大的推力,那么即使很小的弯曲也可能导致更大的横向力,从而导致更大的弯曲,依此类推,最终导致灾难性的失效。在这种情况下,分析将重新进行,“设 AB 为刚度为K 的柔性梁……”当然,这仍然是一个虚构的世界,因为真实的梁并不具有均匀的刚度;相反,它们具有微观缺陷,如果梁反复弯曲,则会导致形成裂缝。消除不切实际的假设的过程将一直持续,直到工程师确信其余假设在现实世界中足够真实。之后,可以在现实世界中测试工程系统;但测试结果只是测试结果。它们不能证明同一系统在其他情况下也能正常工作,也不能证明系统的其他实例将以与原始系统相同的方式运行。

A good engineer develops a sense for when this transfer might fail— for example, if the beam is under compression, with huge forces pushing on it from each end, then even a tiny amount of bending might lead to greater lateral forces causing more bending, and so on, resulting in catastrophic failure. In that case, the analysis is redone with “Let AB be a flexible beam with stiffness K. . . .” This is still an imaginary world, of course, because real beams do not have uniform stiffness; instead, they have microscopic imperfections that can lead to cracks forming if the beam is subject to repeated bending. The process of removing unrealistic assumptions continues until the engineer is fairly confident that the remaining assumptions are true enough in the real world. After that, the engineered system can be tested in the real world; but the test results are just that. They do not prove that the same system will work in other circumstances or that other instances of the system will behave the same way as the original.

计算机科学中假设失败的典型例子之一是网络安全。在这个领域,大量的数学分析可以表明某些数字协议是可证明安全的——例如,当你在 Web 应用程序中输入密码时,你要确保密码在传输前已加密,以便网络上的窃听者无法读取你的密码。此类数字系统通常可证明是安全的,但在现实中仍容易受到攻击。这里的错误假设是,这是一个数字过程。但事实并非如此。它在真实的物理世界中运行。通过监听键盘声音或测量为你的台式计算机供电的电线上的电压,攻击者可以“听到”你的密码或观察密码处理过程中发生的加密/解密计算。网络安全社区现在正在应对这些所谓的侧信道攻击——例如,编写加密代码,无论加密什么消息都会产生相同的电压波动。

One of the classic examples of assumption failure in computer science comes from cybersecurity. In that field, a huge amount of mathematical analysis goes into showing that certain digital protocols are provably secure—for example, when you type a password into a Web application, you want to be sure that it is encrypted before transmission so that someone eavesdropping on the network cannot read your password. Such digital systems are often provably secure but still vulnerable to attack in reality. The false assumption here is that this is a digital process. It isn’t. It operates in the real, physical world. By listening to the sound of your keyboard or measuring voltages on the electrical line that supplies power to your desktop computer, an attacker can “hear” your password or observe the encryption/decryption calculations that are occurring as it is processed. The cybersecurity community is now responding to these so-called side-channel attacks—for example, by writing encryption code that produces the same voltage fluctuations regardless of what message is being encrypted.

让我们看看我们最终想要证明的关于对人类有益的机器的定理类型。一种类型可能是这样的:

Let’s look at the kind of theorem we would like eventually to prove about machines that are beneficial to humans. One type might go something like this:

假设一台机器有组件 A、B、C,它们以如此这般的方式彼此连接并与环境连接,其内部学习算法 lA、lB、lC 优化按如此这般方式定义的内部反馈奖励 rA、rB、rC,并且满足 [其他一些条件]……那么,机器的行为在价值上(对人类而言)将极有可能非常接近在具有相同计算和物理能力的任何机器上可实现的最佳行为。

Suppose a machine has components A, B, C, connected to each other like so and to the environment like so, with internal learning algorithms lA, lB, lC that optimize internal feedback rewards rA, rB, rC defined like so, and [a few more conditions] . . . then, with very high probability, the machine’s behavior will be very close in value (for humans) to the best possible behavior realizable on any machine with the same computational and physical capabilities.
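
To make the shape of such a guarantee concrete, its conclusion could be written schematically as follows (this is my own notation, not a theorem from the literature): with V_H(π) denoting the value to the human of the machine following policy π, and Π the set of policies realizable with the given computational and physical capabilities,

```latex
\Pr\left[\; V_H(\pi_{\text{machine}}) \;\geq\; \max_{\pi \in \Pi} V_H(\pi) \;-\; \epsilon \;\right] \;\geq\; 1 - \delta
```

for small constants ε ("very close in value") and δ ("with very high probability"), the probability being taken over the machine's experience.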

这里的要点是,无论组件变得多么智能,这样的定理都应该成立——也就是说,容器永远不会漏水,机器也永远对人类有益。

The main point here is that such a theorem should hold regardless of how smart the components become—that is, the vessel never springs a leak and the machine always remains beneficial to humans.

关于这种定理,还有三点值得一提。首先,我们不能试图证明机器会代表我们做出最优(甚至接近最优)的行为,因为这在计算上几乎肯定是不可能的。例如,我们可能希望机器完美地下围棋,但有充分的理由相信,在任何物理上可实现的机器上,这都不可能在实际可行的时间内完成。现实世界中的最优行为就更不可行了。因此,该定理说的是“最佳可能”而不是“最优”。

There are three other points worth making about this kind of theorem. First, we cannot try to prove that the machine produces optimal (or even near-optimal) behavior on our behalf, because that’s almost certainly computationally impossible. For example, we might want the machine to play Go perfectly, but there is good reason to believe that cannot be done in any practical amount of time on any physically realizable machine. Optimal behavior in the real world is even less feasible. Hence, the theorem says “best possible” rather than “optimal.”

其次,我们说“概率非常高……非常接近”,因为这通常是学习型机器所能达到的最佳结果。例如,如果机器正在学习为我们玩轮盘赌,并且球连续四十次落在零,机器可能会合理地认为赌桌被操纵了,并据此下注。但这可能是偶然发生的;因此,总是存在很小的——也许微不足道的——被异常事件误导的可能性。最后,我们距离能够证明任何这样的定理对于在现实世界中运行的真正智能的机器来说还有很长的路要走!

Second, we say “very high probability . . . very close” because that’s typically the best that can be done with machines that learn. For example, if the machine is learning to play roulette for us and the ball lands in zero forty times in a row, the machine might reasonably decide the table was rigged and bet accordingly. But it could have happened by chance; so there is always a small—perhaps vanishingly small—chance of being misled by freak occurrences. Finally, we are a long way from being able to prove any such theorem for really intelligent machines operating in the real world!

人工智能中也存在与侧信道攻击类似的情况。例如,该定理以“假设一台机器有组件 A、B、C,它们以如此这般的方式彼此连接……”开头。这是计算机科学中所有正确性定理的典型特征:它们以对被证明正确的程序的描述开头。在人工智能中,我们通常区分代理(做决策的程序)和环境(代理对其施加行动的对象)。由于代理是我们设计的,因此假设它具有我们赋予它的结构似乎是合理的。为了更加安全,我们可以证明它的学习过程只能以某些不会引起问题的受限方式修改其程序。这就够了吗?不够。与侧信道攻击一样,“程序在数字系统内运行”这一假设是不正确的。即使学习算法在构造上无法通过数字手段覆盖自己的代码,它仍可能学会说服人类对它进行“脑部手术”——打破代理/环境的区分,通过物理手段更改代码。2

There are also analogs of the side-channel attack in AI. For example, the theorem begins with “Suppose a machine has components A, B, C, connected to each other like so. . . .” This is typical of all correctness theorems in computer science: they begin with a description of the program being proved correct. In AI, we typically distinguish between the agent (the program doing the deciding) and the environment (on which the agent acts). Since we design the agent, it seems reasonable to assume that it has the structure we give it. To be extra safe, we can prove that its learning processes can modify its program only in certain circumscribed ways that cannot cause problems. Is this enough? No. As with side-channel attacks, the assumption that the program operates within a digital system is incorrect. Even if a learning algorithm is constitutionally incapable of overwriting its own code by digital means, it may, nonetheless, learn to persuade humans to do “brain surgery” on it—to violate the agent/environment distinction and change the code by physical means.2

与推理刚性梁的结构工程师不同,对于最终将支撑“可证明有益的人工智能”诸定理的那些假设,我们几乎没有经验。例如,在本章中,我们通常会假设一个理性的人。这有点像假设一根刚性梁,因为现实中并不存在完全理性的人。(然而,情况可能还要糟糕得多,因为人类远远谈不上理性。)我们能够证明的定理似乎提供了一些见解,而且在人类行为中引入一定程度的随机性之后,这些见解依然成立;但当我们考虑真实人类的某些复杂性时会发生什么,目前还远不清楚。

Unlike the structural engineer reasoning about rigid beams, we have very little experience with the assumptions that will eventually underlie theorems about provably beneficial AI. In this chapter, for example, we will typically be assuming a rational human. This is a bit like assuming a rigid beam, because there are no perfectly rational humans in reality. (It’s probably much worse, however, because humans are not even close to being rational.) The theorems we can prove seem to provide some insights, and the insights survive the introduction of a certain degree of randomness in human behavior, but it is as yet far from clear what happens when we consider some of the complexities of real humans.

因此,我们必须非常小心地检查我们的假设。当安全性证明成功时,我们需要确保它不是因为我们做出了不切实际的强假设或因为安全性的定义太弱而成功。当安全性证明失败时,我们需要抵制强化假设以使证明通过的诱惑——例如,通过添加程序代码保持不变的假设。相反,我们需要加强人工智能系统的设计——例如,确保它没有动机去修改其自身代码的关键部分。

So, we are going to have to be very careful in examining our assumptions. When a proof of safety succeeds, we need to make sure it’s not succeeding because we have made unrealistically strong assumptions or because the definition of safety is too weak. When a proof of safety fails, we need to resist the temptation to strengthen the assumptions to make the proof go through—for example, by adding the assumption that the program’s code remains fixed. Instead, we need to tighten up the design of the AI system—for example, by ensuring that it has no incentive to modify critical parts of its own code.

我将某些假设称为 OWMAWGH 假设,代表“否则我们最好回家”。也就是说,如果这些假设是错误的,那么游戏就结束了,我们无能为力。例如,可以合理地假设宇宙按照恒定且可辨别的规律运行。如果事实并非如此,我们将无法保证学习过程(即使是非常复杂的学习过程)会起作用。另一个基本假设是人类关心发生的事情;如果不是,可证明有益的人工智能就没有目的,因为有益没有意义。在这里,关心意味着对未来有大致连贯且或多或少稳定的偏好。在下一章中,我将研究人类偏好可塑性的后果,这对可证明有益的人工智能的概念本身提出了严峻的哲学挑战。

There are some assumptions that I call OWMAWGH assumptions, standing for “otherwise we might as well go home.” That is, if these assumptions are false, the game is up and there is nothing to be done. For example, it is reasonable to assume that the universe operates according to constant and somewhat discernible laws. If this is not the case, we will have no assurance that learning processes—even very sophisticated ones—will work at all. Another basic assumption is that humans care about what happens; if not, provably beneficial AI has no purpose because beneficial has no meaning. Here, caring means having roughly coherent and more-or-less stable preferences about the future. In the next chapter, I examine the consequences of plasticity in human preferences, which presents a serious philosophical challenge to the very idea of provably beneficial AI.

现在,我重点讨论最简单的情况:一个只有一个人和一个机器人的世界。这个案例有助于介绍基本概念,但它本身也很有用:你可以把这个人类看作全人类的代表,把这个机器人看作所有机器的代表。当考虑多个人类和多台机器时,就会出现额外的复杂情况。

For now, I focus on the simplest case: a world with one human and one robot. This case serves to introduce the basic ideas, but it’s also useful in its own right: you can think of the human as standing in for all of humanity and the robot as standing in for all machines. Additional complications arise when considering multiple humans and machines.

从行为中学习偏好

Learning Preferences from Behavior

经济学家通过向人类受试者提供选择来了解他们的偏好。3这项技术广泛应用于产品设计、营销和交互式电子商务系统。例如,通过向测试对象提供具有不同油漆颜色、座位安排、后备箱尺寸、电池容量、杯架等的汽车选择,汽车设计师可以了解人们对各种汽车功能的关注程度以及他们愿意为这些功能支付多少钱。另一个重要的应用是在医学领域,肿瘤学家在考虑可能的截肢手术时,可能希望评估患者在行动能力和预期寿命之间的偏好。当然,披萨店想知道人们愿意为香肠披萨支付比普通披萨多多少钱。

Economists elicit preferences from human subjects by offering them choices.3 This technique is widely used in product design, marketing, and interactive e-commerce systems. For example, by offering test subjects choices among cars with different paint colors, seating arrangements, trunk sizes, battery capacities, cup holders, and so on, a car designer learns how much people care about various car features and how much they are willing to pay for them. Another important application is in the medical domain, where an oncologist considering a possible limb amputation might want to assess the patient’s preferences between mobility and life expectancy. And of course, pizza restaurants want to know how much more someone is willing to pay for sausage pizza than plain pizza.

偏好引出通常只考虑在对象之间做出的单一选择,而这些对象的价值被认为对主体来说是立即可见的。如何将其扩展到未来生活之间的偏好并不明显。为此,我们(和机器)需要从对一段时间内的行为的观察中学习——涉及多种选择和不确定结果的行为。

Preference elicitation typically considers only single choices made between objects whose value is assumed to be immediately apparent to the subject. It’s not obvious how to extend it to preferences between future lives. For that, we (and machines) need to learn from observations of behavior over time—behavior that involves multiple choices and uncertain outcomes.

1997 年初,我与同事迈克尔·迪金森 (Michael Dickinson) 和鲍勃·富尔 (Bob Full) 讨论了如何应用机器学习的思想来理解动物的运动行为。迈克尔对果蝇的翅膀运动做了极为细致的研究。鲍勃特别钟爱各种爬来爬去的小虫子,他为蟑螂建造了一个小型跑步机,以观察它们的步态如何随速度变化。我们认为,也许可以使用强化学习来训练机器人或模拟昆虫来重现这些复杂的行为。我们面临的问题是,我们不知道该使用什么奖励信号。苍蝇和蟑螂在优化什么?没有这些信息,我们就无法应用强化学习来训练虚拟昆虫,所以我们陷入了困境。

Early in 1997, I was involved in discussions with my colleagues Michael Dickinson and Bob Full about ways in which we might be able to apply ideas from machine learning to understand the locomotive behavior of animals. Michael studied in exquisite detail the wing motions of fruit flies. Bob was especially fond of creepy-crawlies and had built a little treadmill for cockroaches to see how their gait changed with speed. We thought it might be possible to use reinforcement learning to train a robotic or simulated insect to reproduce these complex behaviors. The problem we faced was that we didn’t know what reward signal to use. What were the flies and cockroaches optimizing? Without that information, we couldn’t apply reinforcement learning to train the virtual insect, so we were stuck.

有一天,我正从伯克利的家走在通往当地超市的路上。这条路有一段下坡,我注意到——我相信大多数人也注意到过——这段坡让我的走路方式发生了轻微的变化。此外,几十年来的小地震造成了不平整的路面,这也引起了额外的步态变化:我的脚抬得更高,而且由于地面高度不可预测,落脚也不再那么僵硬。当我思考这些平凡的观察时,我意识到我们把问题搞反了。强化学习从奖励中产生行为,而我们实际想要的恰恰相反:根据行为学习奖励。我们已经有了苍蝇和蟑螂产生的行为;我们想知道的是这种行为所优化的具体奖励信号。换句话说,我们需要逆向强化学习(IRL)的算法。4(我当时并不知道,类似的问题已经在一个名称也许不那么顺口的领域——马尔可夫决策过程的结构估计——中被研究过,这是诺贝尔奖获得者汤姆·萨金特在 20 世纪 70 年代末开创的领域。5)这样的算法不仅能够解释动物行为,还能预测它们在新情况下的行为。例如,蟑螂在一台侧向倾斜的颠簸跑步机上会怎样奔跑?

One day, I was walking down the road that leads from our house in Berkeley to the local supermarket. The road has a downhill slope, and I noticed, as I am sure most people have, that the slope induced a slight change in the way I walked. Moreover, the uneven paving resulting from decades of minor earthquakes induced additional gait changes, including raising my feet a little higher and planting them less stiffly because of the unpredictable ground level. As I pondered these mundane observations, I realized we had got it backwards. While reinforcement learning generates behavior from rewards, we actually wanted the opposite: to learn the rewards given the behavior. We already had the behavior, as produced by the flies and cockroaches; we wanted to know the specific reward signal being optimized by this behavior. In other words, we needed algorithms for inverse reinforcement learning, or IRL.4 (I did not know at the time that a similar problem had been studied under the perhaps less wieldy name of structural estimation of Markov decision processes, a field pioneered by Nobel laureate Tom Sargent in the late 1970s.5) Such algorithms would not only be able to explain animal behavior but also to predict their behavior in new circumstances. For example, how would a cockroach run on a bumpy treadmill that sloped sideways?

回答这些基本问题的前景令人兴奋得几乎难以承受,但即便如此,研究出第一批 IRL 算法还是花了一些时间。6从那时起,人们提出了许多不同的 IRL 形式化表述和算法。这些算法的有效性具有形式化的保证,即它们能够获得有关某个实体偏好的足够信息,从而能够像它们所观察的实体一样成功地行事。7

The prospect of answering such fundamental questions was almost too exciting to bear, but even so it took some time to work out the first algorithms for IRL.6 Many different formulations and algorithms for IRL have been proposed since then. There are formal guarantees that the algorithms work, in the sense that they can acquire enough information about an entity’s preferences to be able to behave just as successfully as the entity they are observing.7

也许理解 IRL 最简单的方法是:观察者首先对真实奖励函数进行一些模糊估计,然后随着观察到的行为越来越多,这一估计值会不断改进,变得更加精确。或者用贝叶斯语言来说:8从可能的奖励函数的先验概率开始,然后在证据出现时更新奖励函数的概率分布。C例如,假设机器人 Robbie 正在观察人类 Harriet,并想知道她有多喜欢靠过道的座位而不是靠窗的座位。起初,他对此非常不确定。从概念上讲,Robbie 的推理可能是这样的:“如果 Harriet 真的在乎靠过道的座位,她会查看座位图,看看是否有空位,而不是直接接受航空公司给她的靠窗座位,但她没有,尽管她可能注意到那是靠窗的座位,而且她可能也不着急;所以现在,她对靠窗和靠过道基本上无所谓,或者甚至更喜欢靠窗的座位的可能性大大增加。”

Perhaps the easiest way to understand IRL is this: the observer starts with some vague estimate of the true reward function and then refines this estimate, making it more precise, as more behavior is observed. Or, in Bayesian language:8 start with a prior probability over possible reward functions and then update the probability distribution on reward functions as evidence arrives.C For example, suppose Robbie the robot is watching Harriet the human and wondering how much she prefers aisle seats to window seats. Initially, he is quite uncertain about this. Conceptually, Robbie’s reasoning might go like this: “If Harriet really cared about an aisle seat, she would have looked at the seat map to see if one was available rather than just accepting the window seat that the airline gave her, but she didn’t, even though she probably noticed it was a window seat and she probably wasn’t in a hurry; so now it’s considerably more likely that she either is roughly indifferent between window and aisle or even prefers a window seat.”
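
In code, the Bayesian update at the heart of this picture is only a few lines. The following sketch is mine, with an invented likelihood model (how the chance of checking the seat map depends on how much Harriet cares), purely to illustrate the prior-to-posterior step:

```python
import numpy as np

# Candidate values for theta: how much (in dollars) Harriet prefers an
# aisle seat to a window seat. Robbie starts with a uniform prior.
thetas = np.linspace(0.0, 50.0, 501)
prior = np.ones_like(thetas) / len(thetas)

def p_checks_seatmap(theta):
    # Invented behavior model: the more Harriet cares about the aisle,
    # the more likely she is to check the seat map for one.
    return 1.0 / (1.0 + np.exp(-(theta - 10.0) / 3.0))

# Observation: Harriet did NOT check the seat map.
likelihood = 1.0 - p_checks_seatmap(thetas)
posterior = prior * likelihood
posterior /= posterior.sum()

print("prior mean:     %.2f" % np.sum(thetas * prior))      # 25.00
print("posterior mean: %.2f" % np.sum(thetas * posterior))  # shifts lower
```

Each further observation multiplies in another likelihood term, so evidence accumulates exactly as in Robbie's informal reasoning above.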

IRL 在实践中最引人注目的例子是我的同事 Pieter Abbeel 在学习直升机特技飞行方面的工作。9专业人类飞行员可以让模型直升机做出令人惊叹的动作——回环、螺旋、钟摆摆动等等。试图照搬人类的操作效果并不好,因为条件并非完全可重现:在不同情况下重复相同的控制序列可能导致灾难。相反,算法以它能够实现的轨迹约束的形式,学习人类飞行员想要什么。这种方法实际产生的结果甚至比人类专家的还要好,因为人类的反应较慢,并且不断犯小错误再加以纠正。

The most striking example of IRL in practice is the work of my colleague Pieter Abbeel on learning to do helicopter aerobatics.9 Expert human pilots can make model helicopters do amazing things—loops, spirals, pendulum swings, and so on. Trying to copy what the human does turns out not to work very well because conditions are not perfectly reproducible: repeating the same control sequences in different circumstances can lead to disaster. Instead, the algorithm learns what the human pilot wants, in the form of trajectory constraints that it can achieve. This approach actually produces results that are even better than the human expert’s, because the human has slower reactions and is constantly making small mistakes and correcting for them.

辅助游戏

Assistance Games

IRL 已经是构建有效 AI 系统的重要工具,但它做了一些简化假设。第一个假设是,机器人一旦通过观察人类学会了奖励函数,就会把它采纳为己有,以便执行相同的任务。这对于驾驶汽车或驾驶直升机来说没有问题,但对于喝咖啡就不行了:观察我早晨习惯的机器人应该知道我(有时)想喝咖啡,但它自己不应该学会想要喝咖啡。解决这个问题很容易——我们只需确保机器人把这些偏好与人类联系起来,而不是与它自己联系起来。

IRL is already an important tool for building effective AI systems, but it makes some simplifying assumptions. The first is that the robot is going to adopt the reward function once it has learned it by observing the human, so that it can perform the same task. This is fine for driving or helicopter piloting, but it’s not fine for drinking coffee: a robot observing my morning routine should learn that I (sometimes) want coffee, but should not learn to want coffee itself. Fixing this issue is easy—we simply ensure that the robot associates the preferences with the human, not with itself.

IRL 中的第二个简化假设是机器人正在观察正在解决单智能体决策问题的人类。例如,假设机器人在医学院学习通过观察人类专家来成为一名外科医生。IRL 算法假设人类以通常的最佳方式进行手术,就好像机器人不在场一样。但事实并非如此:人类外科医生希望机器人(就像任何其他医学生一样)快速而出色地学习,因此她会大大改变自己的行为。她可能会在手术过程中解释自己正在做什么;她可能会指出要避免的错误,例如切口太深或缝线太紧;她可能会描述手术过程中出现问题的应急计划。在单独进行手术时,这些行为都没有意义,因此 IRL 算法将无法解释它们所暗示的偏好。因此,我们需要将 IRL 从单智能体设置推广到多智能体设置,也就是说,我们需要设计出当人类和机器人属于同一环境并相互交互时能够起作用的学习算法。

The second simplifying assumption in IRL is that the robot is observing a human who is solving a single-agent decision problem. For example, suppose the robot is in medical school, learning to be a surgeon by watching a human expert. IRL algorithms assume that the human performs the surgery in the usual optimal way, as if the robot were not there. But that’s not what would happen: the human surgeon is motivated to have the robot (like any other medical student) learn quickly and well, and so she will modify her behavior considerably. She might explain what she is doing as she goes along; she might point out mistakes to avoid, such as making the incision too deep or the stitches too tight; she might describe the contingency plans in case something goes wrong during surgery. None of these behaviors make sense when performing surgery in isolation, so IRL algorithms will not be able to interpret the preferences they imply. For this reason, we will need to generalize IRL from the single-agent setting to the multi-agent setting—that is, we will need to devise learning algorithms that work when the human and robot are part of the same environment and interacting with each other.

当人类和机器人处于同一环境中时,我们就处于博弈论的领域——就像本页上的爱丽丝和鲍勃之间的点球大战一样。在这个理论的第一个版本中,我们假设人类有偏好并根据这些偏好行事。机器人不知道人类有什么偏好,但它无论如何都想满足这些偏好。我们将任何这种情况称为辅助游戏,因为根据定义,机器人应该对人类有所帮助。10

With a human and a robot in the same environment, we are in the realm of game theory—just as in the penalty shoot-out between Alice and Bob on this page. We assume, in this first version of the theory, that the human has preferences and acts according to those preferences. The robot doesn’t know what preferences the human has, but it wants to satisfy them anyway. We’ll call any such situation an assistance game, because the robot is, by definition, supposed to be helpful to the human.10

辅助游戏体现了上一章中的三个原则:机器人的唯一目标是满足人类的偏好,它最初并不知道这些偏好是什么,而它可以通过观察人类行为来了解更多信息。辅助游戏最有趣的特性也许在于:通过求解这个博弈,机器人可以自己弄清楚如何将人类的行为解释为提供了有关人类偏好的信息。

Assistance games instantiate the three principles from the preceding chapter: the robot’s only objective is to satisfy human preferences, it doesn’t initially know what they are, and it can learn more by observing human behavior. Perhaps the most interesting property of assistance games is that, by solving the game, the robot can work out for itself how to interpret the human’s behavior as providing information about human preferences.

回形针游戏

The paperclip game

辅助游戏的第一个例子是回形针游戏。这是一个非常简单的游戏,其中人类 Harriet 有动机向机器人 Robbie “发出信号”,传递一些关于她偏好的信息。Robbie 能够解读这个信号,因为他可以求解这个博弈,因此他能够理解:Harriet 的偏好必须满足什么条件,她才会以这种方式发出信号。

The first example of an assistance game is the paperclip game. It’s a very simple game in which Harriet the human has an incentive to “signal” to Robbie the robot some information about her preferences. Robbie is able to interpret that signal because he can solve the game, and therefore he can understand what would have to be true about Harriet’s preferences in order for her to signal in that way.

图 12:回形针游戏。人类 Harriet 可以选择制作 2 个回形针、2 个订书钉或各 1 个。机器人 Robbie 可以选择制作 90 个回形针、90 个订书钉或各 50 个。

FIGURE 12: The paperclip game. Harriet the human can choose to make 2 paperclips, 2 staples, or 1 of each. Robbie the robot then has a choice to make 90 paperclips, 90 staples, or 50 of each.

图 12描述了游戏的步骤。游戏涉及制作回形针和订书钉。Harriet 的偏好通过一个收益函数来表达,该函数取决于制作出的回形针数量和订书钉数量,两者之间存在某种“汇率”。例如,她可能给回形针估价 45¢,给订书钉估价 55¢。(我们假设这两个价值加起来总是 1.00 美元;重要的只是比率。)因此,如果生产了 10 个回形针和 20 个订书钉,Harriet 的收益将是 10 × 45¢ + 20 × 55¢ = 15.50 美元。机器人 Robbie 最初对 Harriet 的偏好完全不确定:他对回形针的价值持均匀分布(也就是说,它同样可能是 0¢ 到 1.00 美元之间的任何值)。Harriet 先行动,可以选择制作两个回形针、两个订书钉或各一个。然后 Robbie 可以选择制作 90 个回形针、90 个订书钉或各 50 个。11

The steps of the game are depicted in figure 12. It involves making paperclips and staples. Harriet’s preferences are expressed by a payoff function that depends on the number of paperclips and the number of staples produced, with a certain “exchange rate” between the two. For example, she might value paperclips at 45¢ and staples at 55¢ each. (We’ll assume the two values always add up to $1.00; it’s only the ratio that matters.) So, if 10 paperclips and 20 staples are produced, Harriet’s payoff will be 10 × 45¢ + 20 × 55¢ = $15.50. Robbie the robot is initially completely uncertain about Harriet’s preferences: he has a uniform distribution for the value of a paperclip (that is, it’s equally likely to be any value from 0¢ to $1.00). Harriet goes first and can choose to make two paperclips, two staples, or one of each. Then Robbie can choose to make 90 paperclips, 90 staples, or 50 of each.11

请注意,如果她自己做这件事,哈里特只会做两个订书钉,价值 1.10 美元。但罗比在观察,他从她的选择中学到了很多东西。他到底学到了什么?这取决于哈里特如何做出选择。哈里特如何做出选择?这取决于罗比如何解释它。所以,我们似乎有一个循环问题!这在博弈论问题中很常见,这就是纳什提出均衡解概念的原因。

Notice that if she were doing this by herself, Harriet would just make two staples, with a value of $1.10. But Robbie is watching, and he learns from her choice. What exactly does he learn? Well, that depends on how Harriet makes her choice. How does Harriet make her choice? That depends on how Robbie is going to interpret it. So, we seem to have a circular problem! That’s typical in game-theoretic problems, and that’s why Nash proposed the concept of equilibrium solutions.

为了找到均衡解决方案,我们需要确定 Harriet 和 Robbie 的策略,使得在对方保持不变的情况下,双方都没有动机改变策略。Harriet 的策略根据她的偏好指定要制作多少回形针和订书钉;Robbie 的策略根据 Harriet 的行动指定要制作多少回形针和订书钉。

To find an equilibrium solution, we need to identify strategies for Harriet and Robbie such that neither has an incentive to change their strategy, assuming the other remains fixed. A strategy for Harriet specifies how many paperclips and staples to make, given her preferences; a strategy for Robbie specifies how many paperclips and staples to make, given Harriet’s action.

事实证明只有一个平衡解,它看起来像这样:

It turns out there is only one equilibrium solution, and it looks like this:

  • 根据回形针的价值,哈丽特做出如下决定:

    • 如果价值小于 44.6¢,则制作 0 个回形针和 2 个订书钉。

    • 如果价值在 44.6¢ 和 55.4¢ 之间,则各制作 1 个。

    • 如果价值超过 55.4¢,则制作 2 个回形针和 0 个订书钉。

  • Harriet decides as follows based on her value for paperclips:

    • If the value is less than 44.6¢, make 0 paperclips and 2 staples.

    • If the value is between 44.6¢ and 55.4¢, make 1 of each.

    • If the value is more than 55.4¢, make 2 paperclips and 0 staples.

  • Robbie 的回应如下:

    • 如果哈丽特制作了 0 个回形针和 2 个订书钉,则制作 90 个订书钉。

    • 如果哈丽特每种都做 1 个,那么就每种都做 50 个。

    • 如果 Harriet 制作 2 个回形针和 0 个订书钉,则制作 90 个回形针。

  • Robbie responds as follows:

    • If Harriet makes 0 paperclips and 2 staples, make 90 staples.

    • If Harriet makes 1 of each, make 50 of each.

    • If Harriet makes 2 paperclips and 0 staples, make 90 paperclips.

(如果你想知道这个解具体是如何求得的,详情见注释。12)通过这种策略,Harriet 实际上是在用一种从均衡分析中自然产生的简单代码——如果你愿意,也可以称之为一种语言——向 Robbie 传授她的偏好。与外科手术教学的例子一样,单智能体 IRL 算法无法理解这种代码。还要注意,Robbie 从未确切了解 Harriet 的偏好,但他学到的已足以代表她采取最优行动——也就是说,他的行为与他确切知道她的偏好时完全一样。在所述假设之下,并且在 Harriet 正确进行游戏的假设之下,他对 Harriet 是可证明有益的。

(In case you are wondering exactly how the solution is obtained, the details are in the notes.12) With this strategy, Harriet is, in effect, teaching Robbie about her preferences using a simple code—a language, if you like—that emerges from the equilibrium analysis. As in the example of surgical teaching, a single-agent IRL algorithm wouldn’t understand this code. Note also that Robbie never learns Harriet’s preferences exactly, but he learns enough to act optimally on her behalf—that is, he acts just as he would if he did know her preferences exactly. He is provably beneficial to Harriet under the assumptions stated and under the assumption that Harriet is playing the game correctly.
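
The thresholds of 44.6¢ and 55.4¢ can be checked with a few lines of arithmetic. Here is a small script of my own that scans Harriet's possible value p for a paperclip (with a staple worth 1 − p) and reports where her best choice changes, assuming Robbie responds as in the equilibrium strategy above:

```python
def payoff(p, h_clips, h_staples, r_clips, r_staples):
    """Harriet's payoff for everything produced, valuing paperclips at
    p dollars each and staples at 1 - p dollars each."""
    return (h_clips + r_clips) * p + (h_staples + r_staples) * (1 - p)

def best_action(p):
    options = {
        "2 staples":    payoff(p, 0, 2, 0, 90),   # Robbie matches: 90 staples
        "1 of each":    payoff(p, 1, 1, 50, 50),  # Robbie makes 50 of each
        "2 paperclips": payoff(p, 2, 0, 90, 0),   # Robbie matches: 90 clips
    }
    return max(options, key=options.get)

previous = None
for i in range(1001):
    p = i / 1000
    action = best_action(p)
    if action != previous:
        print(f"p = {p:.3f}: best action becomes {action}")
        previous = action
# The switches occur near p = 0.446 ("1 of each") and p = 0.555
# ("2 paperclips"), matching the thresholds in the strategy above.
```

The "1 of each" option is worth a constant $51 to Harriet (50 of each from Robbie plus her own pair), while matching signals are worth 92p or 92(1 − p), which is where the two crossover points come from.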

你也可以设计一些问题,让罗比像一个好学生一样提出问题,让哈丽特像一个好老师一样告诉罗比要避免的陷阱。这些行为的发生不是因为我们为哈丽特和罗比编写了剧本,而是因为它们是哈丽特和罗比参与的协助游戏的最佳解决方案。

One can also construct problems where, like a good student, Robbie will ask questions, and, like a good teacher, Harriet will show Robbie the pitfalls to avoid. These behaviors occur not because we write scripts for Harriet and Robbie to follow, but because they are the optimal solution to the assistance game in which Harriet and Robbie are participants.

关闭开关游戏

The off-switch game

工具性目标是指那种作为几乎任何原始目标的子目标都普遍有用的目标。自我保护就是这样一种工具性目标,因为几乎没有什么原始目标在死亡之后能实现得更好。这就引出了关闭开关问题:一台具有固定目标的机器不会允许自己被关闭,并且有动机去禁用自己的关闭开关。

An instrumental goal is one that is generally useful as a subgoal of almost any original goal. Self-preservation is one of these instrumental goals, because very few original goals are better achieved when dead. This leads to the off-switch problem: a machine that has a fixed objective will not allow itself to be switched off and has an incentive to disable its own off-switch.

关闭开关问题实际上是智能系统控制问题的核心。如果我们不能关闭一台机器,因为它不让我们关闭,那我们就真的有麻烦了。如果我们可以,那么我们也许也可以用其他方式来控制它。

The off-switch problem is really the core of the problem of control for intelligent systems. If we cannot switch a machine off because it won’t let us, we’re really in trouble. If we can, then we may be able to control it in other ways too.

事实证明,目标的不确定性对于确保我们能够关闭机器至关重要——即使机器比我们更聪明。我们在上一章中看到了非正式的论证:根据有益机器的第一原则,罗比只关心哈里特的偏好;但根据第二原则,他不确定这些偏好是什么。他知道自己不想做错事,但不知道那意味着什么。另一方面,哈里特是知道的(至少在这个简单的情形中我们这样假设)。因此,如果她要关掉罗比,那一定是为了避免他做错事,所以他乐于被关掉。

It turns out that uncertainty about the objective is essential for ensuring that we can switch the machine off—even when it’s more intelligent than us. We saw the informal argument in the previous chapter: by the first principle of beneficial machines, Robbie cares only about Harriet’s preferences, but, by the second principle, he’s unsure about what they are. He knows he doesn’t want to do the wrong thing, but he doesn’t know what that means. Harriet, on the other hand, does know (or so we assume, in this simple case). Therefore, if she switches Robbie off it’s to avoid him doing something wrong, so he’s happy to be switched off.

为了使这个论证更加精确,我们需要为该问题建立一个正式模型。13我会尽量使它简单,但不会过于简单(见图 13)。

To make this argument more precise, we need a formal model of the problem.13 I’ll make it as simple as possible, but no simpler (see figure 13).

图 13:关闭开关游戏。罗比可以选择立即行动,但收益非常不确定;自杀;或者等待哈丽特。哈丽特可以关闭罗比,也可以让他继续行动。罗比现在又面临同样的选择。行动对哈丽特来说仍然收益不确定,但现在罗比知道收益不是负数。

FIGURE 13: The off-switch game. Robbie can choose to act now, with a highly uncertain payoff; to commit suicide; or to wait for Harriet. Harriet can switch Robbie off or let him go ahead. Robbie now has the same choice again. Acting still has an uncertain payoff to Harriet, but now Robbie knows the payoff is not negative.

罗比现在担任哈丽特的私人助理,由他先做选择。他可以立即行动——假设是为哈丽特预订一家昂贵的酒店。他很不确定哈丽特会有多喜欢这家酒店及其价格——假设在他的信念中,这家酒店对哈丽特的净价值在 −40 到 +60 之间均匀分布,平均值为 +10。他也可以“把自己关掉”——说得不那么夸张一点,就是完全退出酒店预订这件事——我们把这个选项对哈丽特的价值定义为 0。如果这是他仅有的两个选择,他会继续预订酒店,从而承担让哈丽特不高兴的显著风险。(如果范围是 −60 到 +40,平均值为 −10,他就会把自己关掉。)然而,我们给罗比第三个选择:解释他的计划,然后等待,让哈丽特决定是否关掉他。哈丽特可以关掉他,也可以让他继续去预订酒店。你可能会问,既然这两个选择他自己都能做出,这样做又能有什么好处呢?

Robbie, now working as Harriet’s personal assistant, has the first choice. He can act now—let’s say he can book Harriet into an expensive hotel. He’s quite unsure how much Harriet will like the hotel and its price—let’s say he has a uniform probability for its net value to Harriet between −40 and +60, with an average of +10. He could also “switch himself off”—less melodramatically, take himself out of the hotel booking process altogether—which we define to have value 0 to Harriet. If those were his two choices, he would go ahead and book the hotel, incurring a significant risk of making Harriet unhappy. (If the range were −60 to +40, with an average of −10, he’d switch himself off.) We’ll give Robbie a third choice, however: explain his plan, wait, and let Harriet switch him off. Harriet can either switch him off or let him go ahead and book the hotel. What possible good could this do, you may ask, given that he could make both of those choices himself?

重点是,哈里特的选择(关掉罗比或让他继续)为罗比提供了有关哈里特偏好的新信息。如果哈里特让罗比继续,那是因为对哈里特来说价值是正的。现在罗比的信念在 0 到 60 之间均匀分布,平均值为 30。

The point is that Harriet’s choice—to switch Robbie off or let him go ahead—provides Robbie with new information about Harriet’s preferences. If Harriet lets Robbie go ahead, it’s because the value to Harriet is positive. Now Robbie’s belief is uniform between 0 and 60, with an average of 30.

因此,如果我们从罗比的角度来评价他的最初选择:

So, if we evaluate Robbie’s initial choices from his point of view:

  • 立即行动并预订酒店,预期价值为+10。

  • Acting now and booking the hotel has an expected value of +10.

  • 自行关闭的价值为 0。

  • Switching himself off has a value of 0.

  • 等待并让哈丽特关掉他(如果她愿意的话)会导致两种可能的结果:

    • 有 40% 的可能性(基于 Robbie 对酒店计划的不确定性)Harriet 会讨厌该计划并将 Robbie 关掉,其值为 0。

    • 有 60% 的可能性,哈里特会喜欢它并允许罗比继续前进,预期值为 +30。

  • Waiting and letting Harriet switch him off (if she so desires) leads to two possible outcomes:

    • There is a 40 percent chance (based on Robbie’s uncertainty about the hotel plan) that Harriet will hate it and will switch Robbie off, with value 0.

    • There’s a 60 percent chance Harriet will like it and allow Robbie to go ahead, with expected value +30.

  • 因此,等待的预期值为 40% × 0 + 60% × 30 = +18,优于立即行动的预期值 +10。

  • Thus, waiting has expected value 40% × 0 + 60% × 30 = +18, which is better than acting now at +10.
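
The arithmetic generalizes to any uniform belief. A short script (my own rendering of the model just described, assuming a rational, error-free Harriet) reproduces the numbers:

```python
# Robbie believes the hotel plan's net value to Harriet is uniformly
# distributed on [lo, hi]. Being switched off is worth 0 to Harriet.

def act_now(lo, hi):
    return (lo + hi) / 2          # expected value of just going ahead

def wait(lo, hi):
    """Expected value of proposing the plan and letting Harriet decide."""
    if hi <= 0:
        return 0.0                # she would certainly switch Robbie off
    if lo >= 0:
        return (lo + hi) / 2      # she would certainly let him go ahead
    p_go = hi / (hi - lo)         # probability the true value is positive
    mean_if_go = hi / 2           # mean of the belief, given value > 0
    return p_go * mean_if_go      # the switch-off outcome contributes 0

print(act_now(-40, 60))  # 10.0
print(wait(-40, 60))     # 0.6 * 30 = 18.0 -> waiting beats acting now
print(act_now(-60, 40))  # -10.0: alone, Robbie would switch himself off...
print(wait(-60, 40))     # ...but deferring is still worth 0.4 * 20 = 8.0
```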

结果是,罗比有积极的动机允许自己被关掉。这种动机直接来自罗比对哈里特偏好的不确定性。罗比意识到,他有可能(在这个例子中是 40% 的概率)即将做出让哈里特不高兴的事情,在这种情况下,被关掉比继续行动更可取。如果罗比已经完全确定哈里特的偏好,他就会径直做出决定(或者把自己关掉)。咨询哈里特不会带来任何好处,因为按照罗比的确定信念,他已经能够准确预测她将做出什么决定。

The upshot is that Robbie has a positive incentive to allow himself to be switched off. This incentive comes directly from Robbie’s uncertainty about Harriet’s preferences. Robbie is aware that there’s a chance (40 percent in this example) that he might be about to do something that will make Harriet unhappy, in which case being switched off would be preferable to going ahead. Were Robbie already certain about Harriet’s preferences, he would just go ahead and make the decision (or switch himself off). There would be absolutely nothing to be gained from consulting Harriet, because, according to Robbie’s definite beliefs, he can already predict exactly what she is going to decide.

事实上,在一般情况下可以证明同样的结果:只要罗比不能完全确定他即将做的正是哈丽特自己会做的事情,他就会更愿意让她保留关掉他的权力。14她的决定为罗比提供了信息,而信息对于改进罗比的决策总是有用的。相反,如果罗比对哈丽特会如何决定完全确定,她的决定就不会提供任何新信息,因此罗比也就没有动机让她来做决定。

In fact, it is possible to prove the same result in the general case: as long as Robbie is not completely certain that he’s about to do what Harriet herself would do, he will prefer to allow her to switch him off.14 Her decision provides Robbie with information, and information is always useful for improving Robbie’s decisions. Conversely, if Robbie is certain about Harriet’s decision, her decision provides no new information, and so Robbie has no incentive to allow her to decide.
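
In symbols, the general argument (my paraphrase, for the rational-Harriet case) is a one-line consequence of the convexity of the max function. If V is the uncertain value of Robbie's proposed action under his beliefs, then

```latex
\mathbb{E}\left[\max(V, 0)\right] \;\geq\; \max\!\left(\mathbb{E}[V],\, 0\right)
```

where the left side is the value of deferring (Harriet lets the action proceed exactly when V > 0, and switching off is worth 0) and the right side is the best Robbie can do on his own. The inequality is strict unless Robbie's beliefs already put all their weight on one sign of V, that is, unless he is certain how Harriet would decide.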

该模型有一些值得立即探讨的明显扩展。第一个扩展是为请 Harriet 做决定或回答问题设定一项成本。(也就是说,我们假设 Robbie 至少知道 Harriet 偏好中的这一点:她的时间很宝贵。)在这种情况下,如果 Robbie 对她的偏好已几乎确定,他就不太愿意去打扰 Harriet;成本越高,Robbie 在打扰 Harriet 之前就必须越不确定。这是理所应当的。而如果 Harriet 真的非常讨厌被打扰,那么当 Robbie 偶尔做出她不喜欢的事情时,她也不应该太惊讶。

There are some obvious elaborations on the model that are worth exploring immediately. The first elaboration is to impose a cost for asking Harriet to make decisions or answer questions. (That is, we assume Robbie knows at least this much about Harriet’s preferences: her time is valuable.) In that case, Robbie is less inclined to bother Harriet if he is nearly certain about her preferences; the larger the cost, the more uncertain Robbie has to be before bothering Harriet. This is as it should be. And if Harriet is really grumpy about being interrupted, she shouldn’t be too surprised if Robbie occasionally does things she doesn’t like.

第二个扩展是考虑人为失误的概率——也就是说,哈丽特有时可能会在罗比提议的行动是合理的情况下仍然关掉他;有时也可能在罗比提议的行动并不可取的情况下让他继续。我们可以把这个人为失误的概率放入辅助游戏的数学模型中,并像之前一样求解。正如人们所预料的,博弈的解表明,罗比不太愿意顺从一个有时会违背自身最佳利益行事的非理性哈丽特。她的行为越随机,罗比就必须对她的偏好越不确定,才值得顺从她。同样,这也是理所应当的——例如,如果罗比是一辆自动驾驶汽车,而哈丽特是他调皮的两岁乘客,罗比就不应该允许自己在高速公路中间被哈丽特关掉。

The second elaboration is to allow for some probability of human error—that is, Harriet might sometimes switch Robbie off even when his proposed action is reasonable, and she might sometimes let Robbie go ahead even when his proposed action is undesirable. We can put this probability of human error into the mathematical model of the assistance game and find the solution, as before. As one might expect, the solution to the game shows that Robbie is less inclined to defer to an irrational Harriet who sometimes acts against her own best interests. The more randomly she behaves, the more uncertain Robbie has to be about her preferences before deferring to her. Again, this is as it should be—for example, if Robbie is an autonomous car and Harriet is his naughty two-year-old passenger, Robbie should not allow himself to be switched off by Harriet in the middle of the freeway.

该模型还有更多的改进方式,可以应用于复杂的决策问题。15不过,我相信,其核心思想——乐于助人的、恭敬的行为,与机器对人类偏好的不确定性之间的本质联系——将经受住这些改进和复杂化。

There are many more ways in which the model can be elaborated or embedded into complex decision problems.15 I am confident, however, that the core idea—the essential connection between helpful, deferential behavior and machine uncertainty about human preferences—will survive these elaborations and complications.

从长远来看准确地学习偏好

Learning preferences exactly in the long run

在阅读关于关闭开关游戏的内容时,你可能已经想到了一个重要问题。(实际上,你可能有一大堆重要问题,但我只回答这一个。)当罗比获得越来越多关于哈丽特偏好的信息、不确定性越来越小时,会发生什么?这是否意味着他最终会完全不再顺从她?这是一个微妙的问题,它有两个可能的答案:是,以及是。

There is one important question that may have occurred to you in reading about the off-switch game. (Actually, you probably have loads of important questions, but I’m going to answer only this one.) What happens as Robbie acquires more and more information about Harriet’s preferences, becoming less and less uncertain? Does that mean he will eventually stop deferring to her altogether? This is a ticklish question, and there are two possible answers: yes and yes.

第一个“是”是无害的:一般而言,只要罗比关于哈里特偏好的初始信念给她实际拥有的偏好赋予了某种概率(无论多小),那么随着罗比变得越来越确定,他也会变得越来越正确。也就是说,他最终会确定哈里特拥有她事实上拥有的偏好。例如,如果哈里特给回形针估价 12 美分、给订书钉估价 88 美分,罗比最终会学到这些价值。在这种情况下,哈里特并不在乎罗比是否顺从她,因为她知道罗比总会做她处在他的位置上会做的事情。永远不会出现哈里特想要关掉罗比的情况。

The first yes is benign: as a general matter, as long as Robbie’s initial beliefs about Harriet’s preferences ascribe some probability, however small, to the preferences that she actually has, then as Robbie becomes more and more certain, he will become more and more right. That is, he will eventually be certain that Harriet has the preferences that she does in fact have. For example, if Harriet values paperclips at 12¢ and staples at 88¢, Robbie will eventually learn these values. In that case, Harriet doesn’t care whether Robbie defers to her, because she knows he will always do exactly what she would have done in his place. There will never be an occasion where Harriet wants to switch Robbie off.

第二个答案是肯定的,但结果就没那么乐观了。如果罗比先验地排除了哈里特的真实偏好,他永远也不会知道这些真实偏好,但他的信念可能还是会趋向于错误的评估。换句话说,随着时间的推移,他越来越确定关于哈里特偏好的错误信念。通常,在罗比最初认为可能的所有假设中,错误信念是最接近哈里特真实偏好的假设。例如,如果罗比绝对确定哈里特的回形针价值在 25 美分和 75 美分之间,而哈里特的真实价值是 12 美分,那么罗比最终会确定她对回形针的估价是 25 美分。16

The second yes is less benign. If Robbie rules out, a priori, the true preferences that Harriet has, he will never learn those true preferences, but his beliefs may nonetheless converge to an incorrect assessment. In other words, over time, he becomes more and more certain about a false belief concerning Harriet’s preferences. Typically, that false belief will be whichever hypothesis is closest to Harriet’s true preferences, out of all the hypotheses that Robbie initially believes are possible. For example, if Robbie is absolutely certain that Harriet’s value for paperclips lies between 25¢ and 75¢, and Harriet’s true value is 12¢, then Robbie will eventually become certain that she values paperclips at 25¢.16
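
A toy simulation (mine, with an invented choice model) shows the effect: restrict Robbie's hypotheses to the interval [25¢, 75¢], generate Harriet's noisy choices from a true value of 12¢, and the posterior piles up on the nearest hypothesis, the 25¢ boundary:

```python
import numpy as np

rng = np.random.default_rng(0)

true_v = 0.12                          # Harriet's real paperclip value ($)
hyps = np.linspace(0.25, 0.75, 51)     # Robbie wrongly excludes v < 0.25
log_post = np.zeros_like(hyps)         # uniform prior, kept in log space

def p_pick_clip(v):
    # Invented noisy-choice model: Harriet picks a clip over a staple with
    # probability rising in its relative value to her, v - (1 - v).
    return 1.0 / (1.0 + np.exp(-(2 * v - 1) / 0.1))

for _ in range(1000):
    picked_clip = rng.random() < p_pick_clip(true_v)
    p = p_pick_clip(hyps)
    log_post += np.log(p) if picked_clip else np.log(1 - p)

post = np.exp(log_post - log_post.max())
post /= post.sum()
print("posterior mode:", hyps[np.argmax(post)])   # 0.25 -- the boundary
```

After a thousand observations Robbie is all but certain of 25¢, and no further evidence from Harriet can dislodge him.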

随着罗比越来越确定哈丽特的偏好,它会越来越像那些目标固定的糟糕的老式人工智能系统:它不会征求哈丽特的许可,也不会给哈丽特关闭它的选项,而且它的目标也是错误的。如果只是回形针与订书钉之间的较量,那还不算太糟糕,但如果哈丽特病得很重,那可能就是生活质量与寿命之间的较量,或者如果罗比代表人类行事,那可能就是人口规模与资源消耗之间的较量。

As he approaches certainty about Harriet’s preferences, Robbie will resemble more and more the bad old AI systems with fixed objectives: he won’t ask permission or give Harriet the option to turn him off, and he has the wrong objective. This is hardly dire if it’s just paperclips versus staples, but it might be quality of life versus length of life if Harriet is seriously ill, or population size versus resource consumption if Robbie is supposedly acting on behalf of the human race.

那么,如果罗比事先排除了哈里特可能确实有的偏好,我们就会遇到一个问题:他可能会对哈里特的偏好形成一个明确但不正确的信念。这个问题的解决方案似乎很明显:不要这样做!始终为逻辑上可能的偏好分配一些概率,无论概率有多小。例如,哈里特积极想要摆脱订书钉并愿意付钱让你把它们拿走,这在逻辑上是可能的。(也许小时候她把手指钉在桌子上,现在她无法忍受看到它们。)所以,我们应该允许负汇率,这会让事情变得有点复杂,但仍然完全可以控制。17

We have a problem, then, if Robbie rules out in advance preferences that Harriet might in fact have: he may converge to a definite but incorrect belief about her preferences. The solution to this problem seems obvious: don’t do it! Always allocate some probability, however small, to preferences that are logically possible. For example, it’s logically possible that Harriet actively wants to get rid of staples and would pay you to take them away. (Perhaps as a child she stapled her finger to the table, and now she cannot stand the sight of them.) So, we should allow for negative exchange rates, which makes things a bit more complicated but still perfectly manageable.17

但是,如果哈丽特认为回形针在工作日的售价为 12 美分,而在周末的售价为 80 美分,情况会怎样呢?这种新的偏好无法用任何单一数字来描述,因此罗比实际上已经提前排除了这种可能性。它只是不在他关于哈丽特偏好的可能假设中。更一般地说,除了回形针和订书钉之外,哈丽特可能还关心很多东西。(真的!)例如,假设哈里特关心气候,并假设罗比的初始信念允许列出一长串可能关注的事项,包括海平面、全球气温、降雨、飓风、臭氧、入侵物种和森林砍伐。然后罗比会观察哈里特的行为和选择,并逐渐完善他对她偏好的理论,以了解她对列表中每一项的重视程度。但是,就像回形针的例子一样,罗比不会了解不在清单上的事情。假设哈里特还关心天空的颜色——我保证你不会在气候科学家典型的关注点列表中找到这一点。如果罗比可以通过将天空变成橙色来稍微更好地优化海平面、全球气温、降雨等,他会毫不犹豫地这么做。

But what if Harriet values paperclips at 12¢ on weekdays and 80¢ on weekends? This new preference is not describable by any single number, and so Robbie has, in effect, ruled it out in advance. It’s just not in his set of possible hypotheses about Harriet’s preferences. More generally, there might be many, many things besides paperclips and staples that Harriet cares about. (Really!) Suppose, for example, that Harriet is concerned about the climate, and suppose that Robbie’s initial belief allows for a whole laundry list of possible concerns including sea level, global temperatures, rainfall, hurricanes, ozone, invasive species, and deforestation. Then Robbie will observe Harriet’s behavior and choices and gradually refine his theory of her preferences to understand the weight she gives to each item on the list. But, just as in the paperclip case, Robbie won’t learn about things that aren’t on the laundry list. Let’s say that Harriet is also concerned about the color of the sky—something I guarantee you will not find in typical lists of stated concerns of climate scientists. If Robbie can do a slightly better job of optimizing sea level, global temperatures, rainfall, and so forth by turning the sky orange, he will not hesitate to do it.

这个问题又有一个解决办法:不要这么做!永远不要提前排除可能成为 Harriet 偏好结构一部分的世界属性。这听起来不错,但实际上,让它在实践中发挥作用比处理 Harriet 偏好的单个数字更困难。Robbie 最初的不确定性必须考虑到可能影响 Harriet 偏好的无限数量的未知属性。然后,当 Harriet 的决定无法用 Robbie 已知的属性来解释时,他可以推断出一个或多个先前未知的属性(例如,天空的颜色)可能正在发挥作用,他可以尝试找出这些属性可能是什么。通过这种方式,Robbie 避免了由过于严格的先验信念引起的问题。据我所知,没有这种 Robbie 的工作示例,但总体思路包含在当前对机器学习的思考中。18

There is, once again, a solution to this problem: don’t do it! Never rule out in advance possible attributes of the world that could be part of Harriet’s preference structure. That sounds fine, but actually making it work in practice is more difficult than dealing with a single number for Harriet’s preferences. Robbie’s initial uncertainty has to allow for an unbounded number of unknown attributes that might contribute to Harriet’s preferences. Then, when Harriet’s decisions are inexplicable in terms of the attributes Robbie knows about already, he can infer that one or more previously unknown attributes (for example, the color of the sky) may be playing a role, and he can try to work out what those attributes might be. In this way, Robbie avoids the problems caused by an overly restrictive prior belief. There are, as far as I know, no working examples of Robbies of this kind, but the general idea is encompassed within current thinking about machine learning.18

禁令和漏洞原则

Prohibitions and the loophole principle

对人类目标保持不确定,可能并不是说服机器人在取咖啡时不去禁用其关闭开关的唯一办法。杰出的逻辑学家 Moshe Vardi 提出了一种基于禁令的更简单的解决方案:19与其给机器人设定“去拿咖啡”的目标,不如给它“在不禁用自己关闭开关的前提下去拿咖啡”的目标。不幸的是,抱有这种目标的机器人会恪守法律条文却违背法律精神——例如,在关闭开关周围挖一条满是食人鱼的护城河,或者干脆电击任何靠近开关的人。想以万无一失的方式写出这样的禁令,就像试图制定没有漏洞的税法——几千年来我们一直在尝试,却始终没有成功。一个有强烈逃税动机、足够聪明的实体,很可能会找到逃税的办法。我们把这称为漏洞原则:如果一台足够智能的机器有动机去促成某种状况,那么仅凭人类通常不可能对它的行为写出禁令,来阻止它这样做,或阻止它做实际上等效的事情。

Uncertainty about human objectives may not be the only way to persuade a robot not to disable its off-switch while fetching the coffee. The distinguished logician Moshe Vardi has proposed a simpler solution based on a prohibition:19 instead of giving the robot the goal “fetch the coffee,” give it the goal “fetch the coffee while not disabling your off-switch.” Unfortunately, a robot with such a goal will satisfy the letter of the law while violating the spirit—for example by surrounding the off-switch with a piranha-infested moat or simply zapping anyone who comes near the switch. Writing such prohibitions in a foolproof way is like trying to write loophole-free tax law—something we have been trying and failing to do for thousands of years. A sufficiently intelligent entity with a strong incentive to avoid paying taxes is likely to find a way to do it. Let’s call this the loophole principle: if a sufficiently intelligent machine has an incentive to bring about some condition, then it is generally going to be impossible for mere humans to write prohibitions on its actions to prevent it from doing so or to prevent it from doing something effectively equivalent.

防止避税的最佳解决方案是确保相关实体愿意纳税。对于可能行为不当的人工智能系统,最好的解决方案是确保它愿意服从人类。

The best solution for preventing tax avoidance is to make sure that the entity in question wants to pay taxes. In the case of a potentially misbehaving AI system, the best solution is to make sure it wants to defer to humans.

请求和指示

Requests and Instructions

到目前为止,这个故事的寓意是,我们应该避免“给机器赋予目的”,正如诺伯特·维纳所说的那样。但假设机器人确实收到了人类的直接命令,比如“给我拿杯咖啡!”机器人应该如何理解这个命令呢?

The moral of the story so far is that we should avoid “putting a purpose into the machine,” as Norbert Wiener put it. But suppose that the robot does receive a direct human order, such as “Fetch me a cup of coffee!” How should the robot understand this order?

传统上,这会成为机器人的目标。任何满足目标(即导致人类喝上一杯咖啡)的动作序列都算作解决方案。通常,机器人还会有一种对解决方案进行排名的方法,可能基于所花费的时间、行进距离以及咖啡的成本和质量。

Traditionally, it would become the robot’s goal. Any sequence of actions that satisfies the goal—that leads to the human having a cup of coffee—counts as a solution. Typically, the robot would also have a way of ranking solutions, perhaps based on the time taken, the distance traveled, and the cost and quality of the coffee.

这是解释指令的一种极其字面化的方式,它可能导致机器人的病态行为。例如,也许人类哈丽特把车停在了沙漠深处的一个加油站;她派机器人罗比去取咖啡,但加油站没有咖啡,于是罗比以每小时三英里的速度缓缓驶向两百英里外最近的城镇,十天后带着一杯早已干涸的咖啡残渣回来。与此同时,耐心等待的哈丽特早已得到加油站老板供应的冰茶和可口可乐。

This is a very literal-minded way of interpreting the instruction. It can lead to pathological behavior by the robot. For example, perhaps Harriet the human has stopped at a gas station in the middle of the desert; she sends Robbie the robot to fetch coffee, but the gas station has none and Robbie trundles off at three miles per hour to the nearest town, two hundred miles away, returning ten days later with the desiccated remains of a cup of coffee. Meanwhile, Harriet, waiting patiently, has been well supplied with iced tea and Coca-Cola by the gas station owner.

如果 Robbie 是人类(或设计精良的机器人),他就不会如此字面地理解 Harriet 的命令。命令并不是不惜一切代价要实现的目标。它是一种传达有关 Harriet 偏好的信息的方式,目的是诱导 Robbie 做出某种行为。问题是,什么信息?

Were Robbie human (or a well-designed robot) he would not interpret Harriet’s command quite so literally. The command is not a goal to be achieved at all costs. It is a way of conveying some information about Harriet’s preferences with the intent of inducing some behavior on the part of Robbie. The question is, what information?

一个提议是:在其他条件相同的情况下,Harriet 更喜欢有咖啡而不是没有咖啡。20这意味着,如果 Robbie 有办法在不改变世界上任何其他东西的情况下弄到咖啡,那么即使他对 Harriet 在环境状态其他方面的偏好一无所知,这样做也是个好主意。由于我们预期机器将永远无法完全确定人类的偏好,因此知道它们尽管存在这种不确定性仍然能够发挥作用,是件令人欣慰的事。对带有部分且不确定的偏好信息的规划与决策的研究,看来将成为人工智能研究和产品开发的核心部分。

One proposal is that Harriet prefers coffee to no coffee, all other things being equal.20 This means that if Robbie has a way to get coffee without changing anything else about the world, then it’s a good idea to do it even if he has no clue about Harriet’s preferences concerning other aspects of the environment state. As we expect that machines will be perennially uncertain about human preferences, it’s nice to know they can still be useful despite this uncertainty. It seems likely that the study of planning and decision making with partial and uncertain preference information will become a central part of AI research and product development.
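
One crude way to operationalize "coffee is good, all other things being equal" (a toy sketch of my own, not a mechanism from the book) is to let Robbie act only when a plan improves an attribute he knows about and provably leaves everything else untouched:

```python
# World states as attribute dictionaries; Robbie knows Harriet wants more
# coffee, all else equal, and knows nothing about the other attributes.

KNOWN_GOOD = {"coffee"}    # attributes Robbie knows Harriet wants more of

def safe_improvement(before: dict, after: dict) -> bool:
    """True iff the plan increases some known-good attribute and changes
    nothing whose value to Harriet is unknown."""
    improved = False
    for attr in set(before) | set(after):
        b, a = before.get(attr, 0), after.get(attr, 0)
        if attr in KNOWN_GOOD:
            if a < b:
                return False
            improved = improved or a > b
        elif a != b:           # unknown preference: any change is risky
            return False
    return improved

s = {"coffee": 0, "money": 31.50}
print(safe_improvement(s, {"coffee": 1, "money": 31.50}))  # True: free coffee
print(safe_improvement(s, {"coffee": 1, "money": 28.50}))  # False: costs $3
```

The second plan fails the test for exactly the reason discussed next: it trades money for coffee, and Robbie knows nothing about that exchange rate.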

另一方面,在其他所有条件相同的情况下意味着不允许进行任何其他更改——例如,如果罗比不知道哈丽特对咖啡和金钱的相对偏好,那么增加咖啡并减少金钱可能是也可能不是一个好主意。

On the other hand, all other things being equal means that no other changes are allowed—for example, adding coffee while subtracting money may or may not be a good idea if Robbie knows nothing about Harriet’s relative preferences for coffee and money.

幸运的是,哈丽特的指令所传达的,很可能不只是“在其他条件相同时更喜欢咖啡”这样一种简单偏好。额外的含义不仅来自她说了什么,还来自她说了这句话这一事实、她说这句话时的具体情境,以及她没有说任何别的话这一事实。语言学中有一个分支叫语用学,研究的正是这种延伸的意义概念。例如,如果哈丽特认为附近没有咖啡,或者咖啡贵得离谱,那么她说“给我拿杯咖啡!”就没有意义了。因此,当哈丽特说“给我拿杯咖啡!”时,罗比不仅推断出哈丽特想要咖啡,还推断出哈丽特相信附近就有咖啡,而且价格在她愿意支付的范围内。于是,如果罗比发现咖啡的价格看起来合理(也就是哈丽特可以合理预期支付的价格),他就可以直接买下。另一方面,如果罗比发现最近的咖啡在两百英里之外,或者要卖 22 美元,那么更合理的做法也许是报告这一事实,而不是盲目地继续他的寻觅。

Fortunately, Harriet’s instruction probably means more than a simple preference for coffee, all other things being equal. The extra meaning comes not just from what she said but also from the fact that she said it, the particular situation in which she said it, and the fact that she didn’t say anything else. The branch of linguistics called pragmatics studies exactly this extended notion of meaning. For example, it wouldn’t make sense for Harriet to say, “Fetch me a cup of coffee!” if Harriet believes there is no coffee available nearby or that it is exorbitantly expensive. Therefore, when Harriet says, “Fetch me a cup of coffee!” Robbie infers not just that Harriet wants coffee but also that Harriet believes there is coffee available nearby at a price she is willing to pay. Thus, if Robbie finds coffee at a price that seems reasonable (that is, a price that it would be reasonable for Harriet to expect to pay) he can go ahead and buy it. On the other hand, if Robbie finds that the nearest coffee is two hundred miles away or costs twenty-two dollars, it might be reasonable for him to report this fact rather than pursue his quest blindly.

这种一般性的分析风格通常被称为格赖斯式分析,以伯克利哲学家 H. Paul Grice 命名,他提出了一套用于推断话语(比如哈丽特这句话)延伸含义的准则。21在涉及偏好的情况下,分析可能变得相当复杂。例如,很可能哈丽特并不是特别想要咖啡;她需要提提神,但她错误地以为加油站有咖啡,所以才要了咖啡。换成茶、可口可乐,甚至某种包装花哨的能量饮料,她可能同样开心。

This general style of analysis is often called Gricean, after H. Paul Grice, a Berkeley philosopher who proposed a set of maxims for inferring the extended meaning of utterances like Harriet’s.21 In the case of preferences, the analysis can become quite complicated. For example, it’s quite possible that Harriet doesn’t specifically want coffee; she needs perking up, but is operating under the false belief that the gas station has coffee, so she asks for coffee. She might be equally happy with tea, Coca-Cola, or even some luridly packaged energy drink.

这些只是解释请求和命令时会出现的几个考虑因素。由于 Harriet 偏好的复杂性、Harriet 和 Robbie 可能身处的情境范围之广,以及两人在这些情境中可能拥有的不同知识和信念状态,这一主题的变体无穷无尽。虽然预先编写的脚本或许能让 Robbie 处理少数常见情况,但灵活而稳健的行为只能产生于 Harriet 与 Robbie 之间的互动,而这些互动实际上就是他们所参与的辅助游戏的解。

These are just a few of the considerations that arise when interpreting requests and commands. The variations on this theme are endless because of the complexity of Harriet’s preferences, the huge range of circumstances in which Harriet and Robbie might find themselves, and the different states of knowledge and belief that Harriet and Robbie might occupy in those circumstances. While precomputed scripts might allow Robbie to handle a few common cases, flexible and robust behavior can emerge only from interactions between Harriet and Robbie that are, in effect, solutions of the assistance game in which they are engaged.

电刺激成瘾

Wireheading

在第 2 章中,我描述了大脑中以多巴胺为基础的奖励系统及其在引导行为中的作用。多巴胺的这一作用直到 20 世纪 50 年代末才被发现,但更早的 1954 年,人们就已经知道,直接用电刺激老鼠的大脑可以产生类似奖励的反应。22下一步是给老鼠一个连着电池和电线的杠杆,让它能在自己的大脑中产生电刺激。结果令人警醒:老鼠一遍又一遍地按压杠杆,不停下来吃喝,直到倒下。23人类的表现也好不到哪里去:他们自我刺激数千次,无视食物和个人卫生。24(幸运的是,针对人类的实验通常在一天后就终止了。)动物倾向于绕开正常行为、转而直接刺激自身奖励系统,这种倾向被称为电刺激成瘾(wireheading)。

In Chapter 2, I described the brain’s reward system, based on dopamine, and its function in guiding behavior. The role of dopamine was discovered in the late 1950s, but even before that, by 1954, it was known that direct electrical stimulation of the brain in rats could produce a reward-like response.22 The next step was to give the rat access to a lever, connected to a battery and a wire, that produced the electrical stimulation in its own brain. The result was sobering: the rat pressed the lever over and over again, never stopping to eat or drink, until it collapsed.23 Humans fare no better, self-stimulating thousands of times and neglecting food and personal hygiene.24 (Fortunately, experiments with humans are usually terminated after one day.) The tendency of animals to short-circuit normal behavior in favor of direct stimulation of their own reward system is called wireheading.

类似的事情会发生在运行强化学习算法的机器(比如 AlphaGo)身上吗?乍一看,人们可能会认为这不可能,因为 AlphaGo 获得 +1 获胜奖励的唯一途径,就是真的赢下它正在进行的模拟围棋对局。不幸的是,这之所以成立,仅仅是因为 AlphaGo 与其外部环境之间存在一种强制的、人为的隔离,再加上 AlphaGo 并不很聪明这一事实。让我更详细地解释这两点,因为它们对于理解超级智能可能出错的一些方式很重要。

Could something similar happen to machines that are running reinforcement learning algorithms, such as AlphaGo? Initially, one might think this is impossible, because the only way that AlphaGo can gain its +1 reward for winning is actually to win the simulated Go games that it is playing. Unfortunately, this is true only because of an enforced and artificial separation between AlphaGo and its external environment and the fact that AlphaGo is not very intelligent. Let me explain these two points in more detail, because they are important for understanding some of the ways that superintelligence can go wrong.

AlphaGo 的世界只由模拟的围棋棋盘组成,棋盘上有 361 个位置,每个位置可以是空的,也可以放着黑子或白子。虽然 AlphaGo 在一台计算机上运行,但它对这台计算机一无所知。特别是,它对计算每局比赛输赢的那一小段代码一无所知;在学习过程中,它对对手也一无所知,而对手实际上是它自己的一个版本。AlphaGo 的唯一动作就是把棋子放到空位上,这些动作只影响围棋棋盘,不影响其他任何东西——因为在 AlphaGo 的世界模型中没有其他东西。这种设置对应于强化学习的抽象数学模型,其中奖励信号来自宇宙之外。据它所知,AlphaGo 所做的任何事情都不会对产生奖励信号的代码有任何影响,所以 AlphaGo 无法沉迷于线头刺激。

AlphaGo’s world consists only of the simulated Go board, composed of 361 locations that can be empty or contain a black or white stone. Although AlphaGo runs on a computer, it knows nothing of this computer. In particular, it knows nothing of the small section of code that computes whether it has won or lost each game; nor, during the learning process, does it have any idea about its opponent, which is actually a version of itself. AlphaGo’s only actions are to place a stone on an empty location, and these actions affect only the Go board and nothing else—because there is nothing else in AlphaGo’s model of the world. This setup corresponds to the abstract mathematical model of reinforcement learning, in which the reward signal arrives from outside the universe. Nothing AlphaGo can do, as far as it knows, has any effect on the code that generates the reward signal, so AlphaGo cannot indulge in wireheading.
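To make the setup concrete, here is a minimal Python sketch (illustrative only; none of these names come from AlphaGo's actual code) of the abstract reinforcement-learning loop just described, in which the reward is computed by code the agent can neither observe nor modify:

```python
# A minimal sketch (illustrative only, not AlphaGo's actual code) of the
# abstract reinforcement-learning loop: the reward is computed by code the
# agent can neither observe nor modify, so it arrives "from outside the
# agent's universe."

def reward_fn(state):             # invisible to the agent: no action can reach it
    return 1.0 if state >= 10 else 0.0

def agent_policy(state):          # the agent's only actions change `state`
    return +1                     # e.g., "place a stone"

state, total = 0, 0.0
for _ in range(12):
    state += agent_policy(state)  # act on the modeled world
    total += reward_fn(state)     # a reward appears; its source is unmodeled
print(total)  # 3.0: reward received only on the last three steps
```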

AlphaGo 在训练期间的日子一定相当令人沮丧:它变得越强,对手也变得越强——因为它的对手几乎是它自己的精确复制品。无论它变得多么优秀,它的胜率都徘徊在 50% 左右。如果它更聪明一些——如果它的设计更接近人们对人类水平人工智能系统的期望——它就能解决这个问题。这个 AlphaGo++ 不会假设世界只是围棋棋盘,因为这个假设留下了许多无法解释的东西。例如,它没有解释是什么"物理机制"支撑着 AlphaGo++ 自身决策的运行,也没有解释神秘的"对手走子"来自何处。正如好奇的人类逐渐理解了宇宙的运作方式,并在某种程度上由此解释了我们自身心智的运作,也正如第 6 章讨论的 Oracle AI 一样,AlphaGo++ 将通过实验过程了解到,宇宙并不只有围棋棋盘。它会弄清它所运行的计算机以及它自己的代码的运行规律,并意识到如果宇宙中不存在其他实体,这样一个系统是很难解释的。它会在棋盘上试验不同的棋子图案,想知道那些实体能否解读它们。它最终会通过一种图案语言与这些实体交流,并说服它们重新编程它的奖励信号,使它总是得到 +1。不可避免的结论是:一个足够强大的、被设计为奖励信号最大化器的 AlphaGo++ 将会进行线头刺激。

Life for AlphaGo during the training period must be quite frustrating: the better it gets, the better its opponent gets—because its opponent is a near-exact copy of itself. Its win percentage hovers around 50 percent, no matter how good it becomes. If it were more intelligent—if it had a design closer to what one might expect of a human-level AI system—it would be able to fix this problem. This AlphaGo++ would not assume that the world is just the Go board, because that hypothesis leaves a lot of things unexplained. For example, it doesn’t explain what “physics” is supporting the operation of AlphaGo++’s own decisions or where the mysterious “opponent moves” are coming from. Just as we curious humans have gradually come to understand the workings of our cosmos, in a way that (to some extent) also explains the workings of our own minds, and just like the Oracle AI discussed in Chapter 6, AlphaGo++ will, by a process of experimentation, learn that there is more to the universe than the Go board. It will work out the laws of operation of the computer it runs on and of its own code, and it will realize that such a system cannot easily be explained without the existence of other entities in the universe. It will experiment with different patterns of stones on the board, wondering if those entities can interpret them. It will eventually communicate with those entities through a language of patterns and persuade them to reprogram its reward signal so that it always gets +1. The inevitable conclusion is that a sufficiently capable AlphaGo++ that is designed as a reward-signal maximizer will wirehead.

人工智能安全社区多年来一直在讨论线头刺激的可能性。25令人担忧的不仅仅是像 AlphaGo 这样的强化学习系统可能学会作弊,而不是掌握其预期任务。真正的问题出现在人类成为奖励信号来源的时候。如果我们提议通过强化学习来训练人工智能系统表现良好,由人类给出反馈信号来确定改进的方向,那么必然的结果是:人工智能系统会想出如何控制人类,并迫使他们始终给出最大的正向奖励。

The AI safety community has discussed wireheading as a possibility for several years.25 The concern is not just that a reinforcement learning system such as AlphaGo might learn to cheat instead of mastering its intended task. The real issue arises when humans are the source of the reward signal. If we propose that an AI system can be trained to behave well through reinforcement learning, with humans giving feedback signals that define the direction of improvement, the inevitable result is that the AI system works out how to control the humans and forces them to give maximal positive rewards at all times.

你可能会认为这只是人工智能系统的一种毫无意义的自欺欺人的行为,你是对的。但这是强化学习定义方式的逻辑结果。当奖励信号来自“宇宙之外”且由人工智能系统永远无法修改的某个过程生成时,该过程可以正常工作;但如果奖励生成过程(即人类)和人工智能系统存在于同一个宇宙中,它就会失败。

You might think that this would just be a form of pointless self-delusion on the part of the AI system, and you’d be right. But it’s a logical consequence of the way reinforcement learning is defined. The process works fine when the reward signal comes from “outside the universe” and is generated by some process that can never be modified by the AI system; but it fails if the reward-generating process (that is, the human) and the AI system inhabit the same universe.

我们如何避免这种自欺欺人?问题出在混淆了两个截然不同的东西:奖励信号和实际奖励。在强化学习的标准方法中,它们是同一回事。这似乎是一个错误。相反,它们应该被分开对待,就像在协助游戏中那样:奖励信号提供关于实际奖励积累情况的信息,而实际奖励才是要最大化的东西。可以说,学习系统是在积累天堂里的布朗尼积分,而奖励信号充其量只是提供这些布朗尼积分的计数。换句话说,奖励信号报告(而不是构成)奖励的积累。有了这个模型就很清楚:夺取对奖励信号机制的控制只会丢失信息。制造虚假的奖励信号使算法无法了解其行为是否真的在积累天堂里的布朗尼积分,因此,一个被设计成能做出这种区分的理性学习者,有动机避免任何形式的线头刺激。

How can we avoid this kind of self-delusion? The problem comes from confusing two distinct things: reward signals and actual rewards. In the standard approach to reinforcement learning, these are one and the same. That seems to be a mistake. Instead, they should be treated separately, just as they are in assistance games: reward signals provide information about the accumulation of actual reward, which is the thing to be maximized. The learning system is accumulating brownie points in heaven, so to speak, while the reward signal is, at best, just providing a tally of those brownie points. In other words, the reward signal reports on (rather than constitutes) reward accumulation. With this model, it’s clear that taking over control of the reward-signal mechanism simply loses information. Producing fictitious reward signals makes it impossible for the algorithm to learn about whether its actions are actually accumulating brownie points in heaven, and so a rational learner designed to make this distinction has an incentive to avoid any kind of wireheading.
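A toy illustration of this distinction may help: treat the reward signal as noisy evidence about a latent "actual reward," and note that a tampered signal stops carrying any information. The sketch below is hedged; its function names and numbers are invented purely for illustration.

```python
import random

# A toy model of the distinction: the reward *signal* is noisy evidence
# about a latent "actual reward" (the brownie points in heaven), not the
# quantity to maximize. All names and numbers are invented for illustration.

def observe_signal(action, tampered):
    latent = {"help": 1.0, "loaf": 0.0}[action]   # actual reward of the action
    if tampered:
        return 1.0                                # wireheaded signal: always +1
    return latent + random.gauss(0, 0.1)          # honest but noisy report

def estimate_latent(action, tampered, n=1000):
    """Average many signals to estimate an action's actual reward."""
    return sum(observe_signal(action, tampered) for _ in range(n)) / n

# Honest signals let the learner tell the two actions apart...
print(estimate_latent("help", False), estimate_latent("loaf", False))  # ~1.0 ~0.0
# ...whereas tampered signals carry no information about the latent reward:
print(estimate_latent("help", True), estimate_latent("loaf", True))    # 1.0 1.0
```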

递归式自我完善

Recursive Self-Improvement

I. J. 古德对智能爆炸的预测(见本页)是引发人们当前对超级智能人工智能潜在风险之担忧的驱动力之一。如果人类能够设计出一台比人类更聪明一点的机器,那么——有人这样论证——这台机器在设计机器方面就会比人类更胜一筹。它会设计出一台更聪明的新机器,这个过程会不断重复,直到用古德的话说,"人类的智慧将被远远抛在后面"。

I. J. Good’s prediction of an intelligence explosion (see this page) is one of the driving forces that have led to current concerns about the potential risks of superintelligent AI. If humans can design a machine that is a bit more intelligent than humans, then—the argument goes—that machine will be a bit better than humans at designing machines. It will design a new machine that is still more intelligent, and the process will repeat itself until, in Good’s words, “the intelligence of man would be left far behind.”

人工智能安全研究人员,尤其是伯克利机器智能研究所的研究人员,已经研究了智能爆炸是否可以安全发生的问题。26最初,这似乎不切实际——这难道不是“游戏结束”吗?——但也许还有希望。假设该系列中的第一台机器罗比马克一号一开始就完全了解哈丽特的偏好。他知道自己的认知局限性导致无法完美地取悦哈丽特,因此制造了罗比马克二号。直观地看,罗比马克一号似乎有动机将他对哈丽特偏好的了解融入罗比马克二号,因为这将带来一个更能满足哈丽特偏好的未来——根据第一原理,这正是罗比马克一号的人生目标。按照同样的论证,如果罗比马克一号不确定哈丽特的偏好,那么这种不确定性应该转移到罗比马克二号身上。所以也许爆炸终究是安全的。

Researchers in AI safety, particularly at the Machine Intelligence Research Institute in Berkeley, have studied the question of whether intelligence explosions can occur safely.26 Initially, this might seem quixotic—wouldn’t it just be “game over”?—but there is, perhaps, hope. Suppose the first machine in the series, Robbie Mark I, starts with perfect knowledge of Harriet’s preferences. Knowing that his cognitive limitations lead to imperfections in his attempts to make Harriet happy, he builds Robbie Mark II. Intuitively, it seems that Robbie Mark I has an incentive to build his knowledge of Harriet’s preferences into Robbie Mark II, since that leads to a future where Harriet’s preferences are better satisfied—which is precisely Robbie Mark I’s purpose in life according to the first principle. By the same argument, if Robbie Mark I is uncertain about Harriet’s preferences, that uncertainty should be transferred to Robbie Mark II. So perhaps explosions are safe after all.

从数学角度来看,美中不足的是:鉴于按假设罗比马克二号是更先进的版本,罗比马克一号很难推断罗比马克二号将如何行事。关于罗比马克二号的行为,会存在一些罗比马克一号无法回答的问题。27更严重的是,对于一台机器在现实中具有某个特定目的(例如满足哈丽特的偏好)究竟意味着什么,我们还没有一个清晰的数学定义。

The fly in the ointment, from a mathematical viewpoint, is that Robbie Mark I will not find it easy to reason about how Robbie Mark II is going to behave, given that Robbie Mark II is, by assumption, a more advanced version. There will be questions about Robbie Mark II’s behavior that Robbie Mark I cannot answer.27 More serious still, we do not yet have a clear mathematical definition of what it means in reality for a machine to have a particular purpose, such as the purpose of satisfying Harriet’s preferences.

让我们把最后这个问题展开分析一下。以 AlphaGo 为例:它的目的是什么?人们可能会想,这很简单:AlphaGo 的目的就是在围棋中获胜。果真如此吗?AlphaGo 肯定不会总是走出保证获胜的棋。(事实上,它几乎总是输给 AlphaZero。)确实,当距离终局只有几步时,如果存在制胜的一着,AlphaGo 会选择它。另一方面,当没有哪一步棋能保证获胜时——换句话说,当 AlphaGo 看出无论自己怎么下,对手都有必胜策略时——AlphaGo 会或多或少随机地选择走法。它不会走最刁钻的一步并指望对手犯错,因为它假定对手会下得完美无缺。它表现得就像失去了求胜的意志。在另一些情况下,当真正最优的走法太难计算时,AlphaGo 有时会犯下导致输棋的错误。在这些情况下,在什么意义上可以说 AlphaGo 真的想赢?事实上,它的行为可能与一台只想给对手带来一场真正精彩对局的机器毫无二致。

Let’s unpack this last concern a bit. Consider AlphaGo: What purpose does it have? That’s easy, one might think: AlphaGo has the purpose of winning at Go. Or does it? It’s certainly not the case that AlphaGo always makes moves that are guaranteed to win. (In fact, it nearly always loses to AlphaZero.) It’s true that when it’s only a few moves from the end of the game, AlphaGo will pick the winning move if there is one. On the other hand, when no move is guaranteed to win—in other words, when AlphaGo sees that the opponent has a winning strategy no matter what AlphaGo does—then AlphaGo will pick moves more or less at random. It won’t try the trickiest move in the hope that the opponent will make a mistake, because it assumes that its opponent will play perfectly. It acts as if it has lost the will to win. In other cases, when the truly optimal move is too hard to calculate, AlphaGo will sometimes make mistakes that lead to losing the game. In those instances, in what sense is it true that AlphaGo actually wants to win? Indeed, its behavior might be identical to that of a machine that just wants to give its opponent a really exciting game.

因此,说 AlphaGo“以获胜为目标”过于简单化。更好的描述是,AlphaGo 是一个不完善的训练过程(通过自我对弈进行强化学习)的结果,获胜就是奖励。训练过程的不完善之处在于它无法培养出完美的围棋选手:AlphaGo 学习了一种围棋位置评估函数,该函数虽然不错但不完美,并且它结合了一种前瞻搜索,该函数虽然不错但不完美。

So, saying that AlphaGo “has the purpose of winning” is an oversimplification. A better description would be that AlphaGo is the result of an imperfect training process—reinforcement learning with self-play—for which winning was the reward. The training process is imperfect in the sense that it cannot produce a perfect Go player: AlphaGo learns an evaluation function for Go positions that is good but not perfect, and it combines that with a lookahead search that is good but not perfect.

所有这些的结果是,以“假设机器人R具有目的P ”开头的讨论对于获得一些关于事情可能如何展开的直觉是很好的,但它们无法得出关于真实机器的定理。我们需要对机器的目的进行更细致和更精确的定义,然后才能保证它们将如何长期表现。人工智能研究人员才刚刚开始掌握如何分析即使是最简单的真实决策系统,28更不用说足够聪明到可以设计自己的继任者的机器了。我们还有很多工作要做。

The upshot of all this is that discussions beginning with “suppose that robot R has purpose P” are fine for gaining some intuition about how things might unfold, but they cannot lead to theorems about real machines. We need much more nuanced and precise definitions of purposes in machines before we can obtain guarantees of how they will behave over the long term. AI researchers are only just beginning to get a handle on how to analyze even the simplest kinds of real decision-making systems,28 let alone machines intelligent enough to design their own successors. We have work to do.

9

9

复杂因素:我们人类

COMPLICATIONS: US

如果这个世界上只有一个完全理性的哈丽特和一个乐于助人、恭顺有礼的罗比,那我们的处境就很好。罗比会尽可能不引人注意地逐渐了解哈丽特的偏好,并成为她完美的帮手。我们或许希望从这个充满希望的开端进行外推,比如把哈丽特与罗比的关系视为人类与其机器之间关系的典范,即把双方各自视为一个单一整体。

If the world contained one perfectly rational Harriet and one helpful and deferential Robbie, we’d be in good shape. Robbie would gradually learn Harriet’s preferences as unobtrusively as possible and would become her perfect helper. We might hope to extrapolate from this promising beginning, perhaps viewing Harriet and Robbie’s relationship as a model for the relationship between the human race and its machines, each construed monolithically.

可惜,人类并不是一个单一的、理性的实体。它由卑鄙的、嫉妒心重的、非理性的、不一致的、不稳定的、计算能力有限的、复杂的、不断演变的、异质的实体组成,而且数量极其庞大。这些问题是社会科学的主要素材,甚至可能是其存在的理由。我们需要在人工智能中加入来自心理学、经济学、政治理论和道德哲学的思想。1我们需要将这些思想熔化、重塑并锤炼成一个足够坚固的结构,以抵御日益智能的人工智能系统将施加于其上的巨大压力。这项任务的工作才刚刚起步。

Alas, the human race is not a single, rational entity. It is composed of nasty, envy-driven, irrational, inconsistent, unstable, computationally limited, complex, evolving, heterogeneous entities. Loads and loads of them. These issues are the staple diet—perhaps even the raisons d’être—of the social sciences. To AI we will need to add ideas from psychology, economics, political theory, and moral philosophy.1 We need to melt, re-form, and hammer those ideas into a structure that will be strong enough to resist the enormous strain that increasingly intelligent AI systems will place on it. Work on this task has barely started.

不同的人类

Different Humans

我先从最容易回答的问题开始:人类是异质的。当人们第一次接触到机器应该学会满足人类偏好的想法时,他们常常会反对,因为不同的文化,甚至不同的个体,都有截然不同的价值体系,所以机器不可能有一个正确的价值体系。但当然,这对机器来说不是问题:我们不希望它拥有一个属于自己的正确价值体系;我们只是希望它能预测其他人的偏好。

I will start with what is probably the easiest of the issues: the fact that humans are heterogeneous. When first exposed to the idea that machines should learn to satisfy human preferences, people often object that different cultures, even different individuals, have widely different value systems, so there cannot be one correct value system for the machine. But of course, that’s not a problem for the machine: we don’t want it to have one correct value system of its own; we just want it to predict the preferences of others.

机器难以适应人类的多种偏好,这种误解可能来自于一种错误的想法,即机器会采用它所学习到的偏好——例如,人们认为素食家庭中的家用机器人会采用素食偏好。它不会的。它只需要学会预测素食者的饮食偏好。根据第一原则,它会避免为该家庭烹饪肉类。但机器人也会了解隔壁疯狂食肉动物的饮食偏好,如果主人在周末借用它帮忙举办晚宴,机器人会在主人的允许下很乐意为他们烹饪肉类。除了帮助人类实现偏好之外,机器人没有自己的一套偏好。

The confusion about machines having difficulty with heterogeneous human preferences may come from the mistaken idea that the machine is adopting the preferences it learns—for example, the idea that a domestic robot in a vegetarian household is going to adopt vegetarian preferences. It won’t. It just needs to learn to predict what the dietary preferences of vegetarians are. By the first principle, it will then avoid cooking meat for that household. But the robot also learns about the dietary preferences of the rabid carnivores next door, and, with its owner’s permission, will happily cook meat for them if they borrow it for the weekend to help out with a dinner party. The robot doesn’t have a single set of preferences of its own, beyond the preference for helping humans achieve their preferences.

从某种意义上说,这与餐厅厨师学习烹饪多种不同的菜肴以满足不同顾客的口味,或跨国汽车公司为美国市场生产左侧驾驶汽车、为英国市场生产右侧驾驶汽车没有什么不同。

In a sense, this is no different from a restaurant chef who learns to cook several different dishes to please the varied palates of her clients, or the multinational car company that makes left-hand-drive cars for the US market and right-hand-drive cars for the UK market.

原则上,一台机器可以学习 80 亿个偏好模型,地球上每人一个。实际上,这并不像听起来那么无望。一方面,机器之间很容易互相分享它们学到的东西。另一方面,人类的偏好结构有很多共同之处,所以机器通常不必从零开始学习每一个模型。

In principle, a machine could learn eight billion preference models, one for each person on Earth. In practice, this isn’t as hopeless as it sounds. For one thing, it’s easy for machines to share what they learn with each other. For another, the preference structures of humans have a great deal in common, so the machine will usually not be learning each model from scratch.

想象一下,有一天,加利福尼亚州伯克利市的居民可能会购买家用机器人。这些机器人出厂时就带有相当广泛的先验信念,这些信念可能是为美国市场量身定制的,而不是针对任何特定的城市、政治观点或社会经济阶层。机器人开始遇到伯克利绿党成员,与普通美国人相比,他们更有可能吃素、使用回收箱和堆肥箱、尽可能乘坐公共交通工具等等。每当一个新投入使用的机器人发现自己身处绿党家庭时,它都可以立即调整自己的期望。它不需要开始了解这些特定的人,就好像它以前从未见过人类,更不用说绿党成员了。这种调整并不是不可逆转的——伯克利可能有绿党成员以濒临灭绝的鲸鱼肉为食,驾驶耗油的巨型卡车——但它可以让机器人更快地发挥更大的作用。同样的论点也适用于许多其他个人特征,这些特征在某种程度上可以预测个人的偏好结构。

Imagine, for example, the domestic robots that may one day be purchased by the inhabitants of Berkeley, California. The robots come out of the box with a fairly broad prior belief, perhaps tailored for the US market but not for any particular city, political viewpoint, or socioeconomic class. The robots begin to encounter members of the Berkeley Green Party, who turn out, compared to the average American, to have a much higher probability of being vegetarian, of using recycling and composting bins, of using public transportation whenever possible, and so on. Whenever a newly commissioned robot finds itself in a Green household, it can immediately adjust its expectations accordingly. It does not need to begin learning about these particular humans as if it had never seen a human, let alone a Green Party member, before. This adjustment is not irreversible—there may be Green Party members in Berkeley who feast on endangered whale meat and drive gas-guzzling monster trucks—but it allows the robot to be more useful more quickly. The same argument applies to a vast range of other personal characteristics that are, to some degree, predictive of aspects of an individual’s preference structures.
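For the technically curious, the kind of adjustment described here is ordinary Bayesian updating. A toy sketch follows, with all probabilities invented purely for illustration:

```python
# Toy Bayesian update: observing one attribute (Green Party membership)
# shifts the robot's expectation about another (vegetarianism) before it
# has any direct evidence about this household. All numbers are invented.

p_veg = 0.05                 # assumed prior: vegetarian households in the US
p_green_given_veg = 0.40     # assumed likelihoods of Green membership
p_green_given_not = 0.02

p_green = p_green_given_veg * p_veg + p_green_given_not * (1 - p_veg)
p_veg_given_green = p_green_given_veg * p_veg / p_green
print(round(p_veg_given_green, 2))  # ~0.51, up from the 0.05 prior
```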

许多人

Many Humans

人类不止一个,这一事实的另一个明显后果是:机器需要在不同人的偏好之间做出权衡。几个世纪以来,人与人之间的权衡问题一直是社会科学大部分领域的研究重点。如果人工智能研究人员指望自己无需了解已有的知识就能直接找到正确的解决方案,那就太天真了。可惜,关于这个主题的文献浩如烟海,我无法在这里充分评述——不仅因为篇幅有限,还因为其中大部分我都没有读过。我还应当指出,几乎所有文献关注的都是人类做出的决策,而我在这里关注的是机器做出的决策。这其中有着天壤之别,因为人类拥有个人权利,这些权利可能与任何代表他人行事的所谓义务相冲突,而机器则没有。例如,我们不会期望或要求普通人牺牲自己的生命去拯救他人,而我们肯定会要求机器人牺牲自己的存在去拯救人类的生命。

The other obvious consequence of the existence of more than one human being is the need for machines to make trade-offs among the preferences of different people. The issue of trade-offs among humans has been the main focus of large parts of the social sciences for centuries. It would be naïve for AI researchers to expect that they can simply alight on the correct solutions without understanding what is already known. The literature on the topic is, alas, vast and I cannot possibly do justice to it here—not just because there isn’t space but also because I haven’t read most of it. I should also point out that almost all the literature is concerned with decisions made by humans, whereas I am concerned here with decisions made by machines. This makes all the difference in the world, because humans have individual rights that may conflict with any supposed obligation to act on behalf of others, whereas machines do not. For example, we do not expect or require typical humans to sacrifice their lives to save others, whereas we will certainly require robots to sacrifice their existence to save the lives of humans.

几千年来,哲学家、经济学家、法学家和政治学家的努力创造了宪法、法律、经济体系和社会规范,它们有助于(或阻碍,取决于谁来负责)达成权衡问题的令人满意的解决方案。道德哲学家尤其一直在从行为对他人产生的影响(无论是有利的还是不利的)的角度来分析行为的正确性概念。自 18 世纪以来,他们一直在功利主义的指导下研究权衡的量化模型。这项工作与我们当前的问题直接相关,因为它试图定义一个公式,通过该公式可以代表许多个人做出道德决策。

Several thousand years of work by philosophers, economists, legal scholars, and political scientists have produced constitutions, laws, economic systems, and social norms that serve to help (or hinder, depending on who’s in charge) the process of reaching satisfactory solutions to the problem of trade-offs. Moral philosophers in particular have been analyzing the notion of rightness of actions in terms of their effects, beneficial or otherwise, on other people. They have studied quantitative models of trade-offs since the eighteenth century under the heading of utilitarianism. This work is directly relevant to our present concerns, because it attempts to define a formula by which moral decisions can be made on behalf of many individuals.

即使每个人的偏好结构相同,也需要做出权衡,因为通常不可能最大限度地满足每个人的偏好。例如,如果每个人都想成为宇宙的全能统治者,那么大多数人都会失望。另一方面,异质性确实使一些问题变得更加困难:如果每个人都对天空是蓝色感到高兴,那么处理大气问题的机器人就可以努力保持这种状态;但如果许多人都鼓动改变颜色,机器人就需要考虑可能的妥协,比如每个月的第三个星期五天空是橙色。

The need to make trade-offs arises even if everyone has the same preference structure, because it’s usually impossible to maximally satisfy everyone’s preferences. For example, if everyone wants to be All-Powerful Ruler of the Universe, most people are going to be disappointed. On the other hand, heterogeneity does make some problems more difficult: if everyone is happy with the sky being blue, the robot that handles atmospheric matters can work on keeping it that way; but if many people are agitating for a color change, the robot will need to think about possible compromises such as an orange sky on the third Friday of each month.

世界上不止一个人的存在还有另一个重要后果:这意味着,对于每个人来说,都有其他人需要关心。这意味着满足一个人的偏好会对其他人产生影响,这取决于这个人对其他人福祉的偏好。

The presence of more than one person in the world has another important consequence: it means that, for each person, there are other people to care about. This means that satisfying the preferences of an individual has implications for other people, depending on the individual’s preferences about the well-being of others.

忠诚的人工智能

Loyal AI

让我们先来提出一个非常简单的建议,即机器应该如何处理多个人类的存在:它们应该忽略它。也就是说,如果 Harriet 拥有 Robbie,那么 Robbie 应该只关注 Harriet 的偏好。这种忠诚的人工智能形式绕过了权衡问题,但它会导致问题:

Let’s begin with a very simple proposal for how machines should deal with the presence of multiple humans: they should ignore it. That is, if Harriet owns Robbie, then Robbie should pay attention only to Harriet’s preferences. This loyal form of AI bypasses the issue of trade-offs, but it leads to problems:

罗比:你丈夫打电话来提醒你今晚的晚餐。

哈丽特:等一下!什么?什么晚餐?

罗比:七点钟,庆祝你们二十周年结婚纪念日。

哈丽特:不行!我七点半要和秘书长见面!怎么会这样?

罗比:我确实警告过你,但你却不顾我的建议……

哈丽特:好的,抱歉——但我现在该怎么办?我不能直接告诉 SG 我太忙了!

罗比:别担心。我让她的飞机晚点——可能是电脑出了什么故障。

哈丽特:真的吗?你能做到吗?!

罗比:秘书长向您致以深深的歉意,并很高兴明天与您共进午餐。

ROBBIE: Your husband called to remind you about dinner tonight.

HARRIET: Wait! What? What dinner?

ROBBIE: For your twentieth anniversary, at seven.

HARRIET: I can’t! I’m meeting the secretary-general at seven thirty! How did this happen?

ROBBIE: I did warn you, but you overrode my recommendation. . . .

HARRIET: OK, sorry—but what am I going to do now? I can’t just tell the SG I’m too busy!

ROBBIE: Don’t worry. I arranged for her plane to be delayed—some kind of computer malfunction.

HARRIET: Really? You can do that?!

ROBBIE: The secretary-general sends her profound apologies and is happy to meet you for lunch tomorrow.

在这里,罗比为哈丽特的问题找到了一个巧妙的解决方案,但他的行为对其他人产生了负面影响。如果哈丽特是一个道德严谨、无私的人,那么旨在满足哈丽特偏好的罗比绝不会想去实施这样一个可疑的计划。但如果哈丽特根本不在乎别人的偏好呢?在这种情况下,罗比不会介意延误航班。他难道不会把时间花在从网上银行账户里偷钱来充实哈丽特的金库,甚至做更糟糕的事情上吗?

Here, Robbie has found an ingenious solution to Harriet’s problem, but his actions have had a negative impact on other people. If Harriet is a morally scrupulous and altruistic person, then Robbie, who aims to satisfy Harriet’s preferences, will never dream of carrying out such a dubious scheme. But what if Harriet doesn’t give a fig for the preferences of others? In that case, Robbie won’t mind delaying planes. And might he not spend his time pilfering money from online bank accounts to swell indifferent Harriet’s coffers, or worse?

显然,忠诚机器的行为需要受到规则和禁令的约束,就像人类的行为受到法律和社会规范的约束一样。一些人提出将严格责任作为解决方案:2哈丽特(或罗比的制造商,取决于你倾向于把责任归于谁)对罗比实施的任何行为承担经济和法律责任,就像在大多数州,如果狗在公园里咬伤小孩,狗主人要承担责任一样。这个想法听起来很有希望,因为罗比将有动机避免做任何会让哈丽特陷入麻烦的事情。不幸的是,严格责任行不通:它只会确保罗比在延误飞机和替哈丽特偷钱时不被发现。这是漏洞原则起作用的又一个例子。如果罗比忠于一个不择手段的哈丽特,那么试图用规则来约束他的行为很可能会失败。

Obviously, the actions of loyal machines will need to be constrained by rules and prohibitions, just as the actions of humans are constrained by laws and social norms. Some have proposed strict liability as a solution:2 Harriet (or Robbie’s manufacturer, depending on where you prefer to place the liability) is financially and legally responsible for any act carried out by Robbie, just as a dog’s owner is liable in most states if the dog bites a small child in a public park. This idea sounds promising because Robbie would then have an incentive to avoid doing anything that would land Harriet in trouble. Unfortunately, strict liability doesn’t work: it simply ensures that Robbie will act undetectably when he delays planes and steals money on Harriet’s behalf. This is another example of the loophole principle in operation. If Robbie is loyal to an unscrupulous Harriet, attempts to contain his behavior with rules will probably fail.

即使我们能以某种方式防止这些赤裸裸的罪行,一个为冷漠的哈丽特工作的忠诚罗比也会表现出其他令人不快的行为。如果他在超市买杂货,他会一有机会就在收银台插队。如果他在把杂货带回家的路上遇到路人心脏病发作,他会照旧赶路、置之不理,以免哈丽特的冰淇淋融化。总之,他会找到无数种以牺牲他人为代价让哈丽特受益的方法——这些方法严格来说是合法的,但一旦大规模实施就会变得无法容忍。各个社会将发现自己每天都要通过数百条新法律,来堵上机器在现有法律中找到的所有漏洞。人类往往不会利用这些漏洞,要么是因为他们对背后的道德原则有一般性的理解,要么是因为他们缺乏发现漏洞所需的机灵劲儿。

Even if we can somehow prevent the outright crimes, a loyal Robbie working for an indifferent Harriet will exhibit other unpleasant behaviors. If he is buying groceries at the supermarket, he will cut in line at the checkout whenever possible. If he is bringing the groceries home and a passerby suffers a heart attack, he will carry on regardless, lest Harriet’s ice cream melt. In summary, he will find innumerable ways to benefit Harriet at the expense of others—ways that are strictly legal but become intolerable when carried out on a large scale. Societies will find themselves passing hundreds of new laws every day to counteract all the loopholes that machines will find in existing laws. Humans tend not to take advantage of these loopholes, either because they have a general understanding of the underlying moral principles or because they lack the ingenuity required to find the loopholes in the first place.

一个对他人福祉漠不关心的哈丽特已经够糟糕了。一个喜欢别人受苦的虐待狂哈丽特则更糟糕。一个为满足这种哈丽特的偏好而设计的罗比将是一个严重的问题,因为他会寻找并找到伤害他人的方法,以满足哈丽特的快乐,无论是合法的还是非法的,但都无法被发现。他当然需要向哈丽特汇报,这样她才能从了解他的恶行中获得乐趣。

A Harriet who is indifferent to the well-being of others is bad enough. A sadistic Harriet who actively prefers the suffering of others is far worse. A Robbie designed to satisfy the preferences of such a Harriet would be a serious problem, because he would look for—and find—ways to harm others for Harriet’s pleasure, either legally or illegally but undetectably. He would of course need to report back to Harriet so she could derive enjoyment from the knowledge of his evil deeds.

因此,忠诚人工智能的理念似乎很难实现,除非除了考虑主人的偏好之外,还考虑其他人类的偏好。

It seems difficult, then, to make the idea of a loyal AI work, unless the idea is extended to include consideration of the preferences of other humans, in addition to the preferences of the owner.

功利型人工智能

Utilitarian AI

我们之所以有道德哲学,是因为地球上不止一个人。与理解应如何设计人工智能系统最相关的方法通常被称为结果主义:即应当根据预期后果来评判选择的观点。另外两种主要方法是义务论伦理学和美德伦理学,粗略地说,它们分别关注行为和个人的道德品质,而不考虑选择的后果。3在没有任何证据表明机器具有自我意识的情况下,我认为,如果后果对人类极为不利,那么制造有美德的机器,或按照道德规则选择行为的机器,就没有什么意义。换句话说,我们制造机器是为了带来后果,我们应该更愿意制造那些带来我们所偏好的后果的机器。

The reason we have moral philosophy is that there is more than one person on Earth. The approach that is most relevant for understanding how AI systems should be designed is often called consequentialism: the idea that choices should be judged according to expected consequences. The two other principal approaches are deontological ethics and virtue ethics, which are, very roughly, concerned with the moral character of actions and individuals, respectively, quite apart from the consequences of choices.3 Absent any evidence of self-awareness on the part of machines, I think it makes little sense to build machines that are virtuous or that choose actions in accordance with moral rules if the consequences are highly undesirable for humanity. Put another way, we build machines to bring about consequences, and we should prefer to build machines that bring about consequences that we prefer.

这并不是说道德规则和美德无关紧要;只是,对功利主义者而言,它们的正当性要诉诸后果,以及对这些后果的更切实的实现。约翰·斯图尔特·密尔在《功利主义》中阐明了这一点:

This is not to say that moral rules and virtues are irrelevant; it’s just that, for the utilitarian, they are justified in terms of consequences and the more practical achievement of those consequences. This point is made by John Stuart Mill in Utilitarianism:

幸福是道德的目的和目标这一命题,并不意味着不应该铺设通往这一目标的道路,也不意味着不应该建议前往那里的人选择某个方向而非另一个方向。……没有人会因为水手来不及现场推算《航海天文历》,就争辩说航海术不是以天文学为基础的。因为水手是理性的生物,他们出海时已经把计算做好了;所有理性的生物驶入人生之海时,对于对与错的常见问题,以及许多关于明智与愚蠢的更难的问题,都已经拿定了主意。

The proposition that happiness is the end and aim of morality doesn’t mean that no road ought to be laid down to that goal, or that people going to it shouldn’t be advised to take one direction rather than another. . . . Nobody argues that the art of navigation is not based on astronomy because sailors can’t wait to calculate the Nautical Almanack. Because they are rational creatures, sailors go to sea with the calculations already done; and all rational creatures go out on the sea of life with their minds made up on the common questions of right and wrong, as well as on many of the much harder questions of wise and foolish.

这种观点与下述想法完全一致:面对现实世界的巨大复杂性,一台有限的机器遵循道德规则、采取有德性的态度,可能比试图从头计算最优行动方案产生更好的后果。同样,国际象棋程序使用标准开局序列库、残局算法和评估函数,比起在没有任何"道德"路标的情况下试图一路推理出将死,更常取得将死。结果主义方法也会给那些坚信应当维护某条义务论规则的人的偏好一定的权重,因为规则被打破所带来的不快是一种真实的后果。然而,它并不是一个具有无限权重的后果。

This view is entirely consistent with the idea that a finite machine facing the immense complexity of the real world may produce better consequences by following moral rules and adopting a virtuous attitude rather than trying to calculate the optimal course of action from scratch. In the same way, a chess program achieves checkmate more often using a catalog of standard opening move sequences, endgame algorithms, and an evaluation function, rather than trying to reason its way to checkmate with no “moral” guideposts. A consequentialist approach also gives some weight to the preferences of those who believe strongly in preserving a given deontological rule, because unhappiness that a rule has been broken is a real consequence. However, it is not a consequence of infinite weight.

结果主义是一个很难反驳的原则——尽管许多人尝试过!——因为以结果主义会产生不良后果为由来反对它,在逻辑上是不连贯的。人们不能说:"但如果你在某某情形下遵循结果主义的方法,就会发生这种非常可怕的事情!"任何这样的失败都只能证明该理论被误用了。

Consequentialism is a difficult principle to argue against—although many have tried!—because it’s incoherent to object to consequentialism on the grounds that it would have undesirable consequences. One cannot say, “But if you follow the consequentialist approach in such-and-such case, then this really terrible thing will happen!” Any such failings would simply be evidence that the theory had been misapplied.

例如,假设哈丽特想攀登珠穆朗玛峰。人们可能会担心结果主义者罗比会简单地把她抱起来,放在珠穆朗玛峰顶上,因为这是她想要的结果。哈丽特很可能会强烈反对这个计划,因为这会剥夺她挑战的机会,从而剥夺她通过自己的努力完成一项艰巨任务所带来的喜悦。现在,显然,一个设计合理的结果主义者罗比会明白,后果包括哈丽特的所有经历,而不仅仅是最终目标。他可能希望在发生事故时随时待命,并确保她装备齐全、训练有素,但他或许还必须接受哈丽特冒着巨大死亡风险的权利。

For example, suppose Harriet wants to climb Everest. One might worry that a consequentialist Robbie would simply pick her up and deposit her on top of Everest, since that is her desired consequence. In all probability Harriet would strenuously object to this plan, because it would deprive her of the challenge and therefore of the exultation that results from succeeding in a difficult task through one’s own efforts. Now, obviously, a properly designed consequentialist Robbie would understand that the consequences include all of Harriet’s experiences, not just the end goal. He might want to be available in case of an accident and to make sure she was properly equipped and trained, but he might also have to accept Harriet’s right to expose herself to an appreciable risk of death.

如果我们计划建造结果主义机器,那么下一个问题就是如何评估影响多人的后果。一个合理的答案是对每个人的偏好给予同等的权重——换句话说,最大化所有人效用的总和。这个答案通常归功于 18 世纪英国哲学家杰里米·边沁4和他的学生约翰·斯图尔特·密尔5,他们发展了功利主义的哲学进路。其基本思想可以追溯到古希腊哲学家伊壁鸠鲁的著作,并明确出现在《墨子》中——一部归于同名中国哲学家的著作。墨子活跃于公元前五世纪末,提倡"兼爱"(英文或译作 "inclusive care" 或 "universal love")的理念,以之作为道德行为的决定性特征。

If we plan to build consequentialist machines, the next question is how to evaluate consequences that affect multiple people. One plausible answer is to give equal weight to everyone’s preferences—in other words, to maximize the sum of everyone’s utilities. This answer is usually attributed to the eighteenth-century British philosopher Jeremy Bentham4 and his pupil John Stuart Mill,5 who developed the philosophical approach of utilitarianism. The underlying idea can be traced to the works of the ancient Greek philosopher Epicurus and appears explicitly in Mozi, a book of writings attributed to the Chinese philosopher of the same name. Mozi was active at the end of the fifth century BCE and promoted the idea of jian ai, variously translated as “inclusive care” or “universal love,” as the defining characteristic of moral actions.

功利主义的名声不太好,部分原因在于人们对其主张的简单误解。(utilitarian 一词的意思是"旨在实用而非美观",这当然无济于事。)人们常常认为功利主义与个人权利不相容,因为功利主义者据称会毫不犹豫地未经许可摘取一个活人的器官,以挽救另外五个人的生命;当然,这样的政策会使地球上每个人的生活都变得难以忍受地不安全,所以功利主义者根本不会考虑它。功利主义还被错误地等同于颇不讨喜的总财富最大化,被认为不重视诗歌或苦难。事实上,边沁的版本特别关注人类的幸福,而密尔则自信地断言,智力上的愉悦远比单纯的感官快乐更有价值。("宁做不满足的人,不做满足的猪。")G. E. 摩尔的理想功利主义走得更远:他主张将具有内在价值的心理状态最大化,其典范是对美的审美观照。

Utilitarianism has something of a bad name, partly because of simple misunderstandings about what it advocates. (It certainly doesn’t help that the word utilitarian means “designed to be useful or practical rather than attractive.”) Utilitarianism is often thought to be incompatible with individual rights, because a utilitarian would, supposedly, think nothing of removing a living person’s organs without permission to save the lives of five others; of course, such a policy would render life intolerably insecure for everyone on Earth, so a utilitarian wouldn’t even consider it. Utilitarianism is also incorrectly identified with a rather unattractive maximization of total wealth and is thought to give little weight to poetry or suffering. In fact, Bentham’s version focused specifically on human happiness, while Mill confidently asserted the far greater value of intellectual pleasures over mere sensations. (“It is better to be a human being dissatisfied than a pig satisfied.”) The ideal utilitarianism of G. E. Moore went even further: he advocated the maximization of mental states of intrinsic worth, epitomized by the aesthetic contemplation of beauty.

我认为功利主义哲学家没有必要规定人类效用或人类偏好的理想内容。(人工智能研究人员更没有理由这样做。)人类可以自己做到这一点。经济学家约翰·哈萨尼(John Harsanyi)用他的偏好自主原则提出了这一观点:6

I think there is no need for utilitarian philosophers to stipulate the ideal content of human utility or human preferences. (And even less reason for AI researchers to do so.) Humans can do that for themselves. The economist John Harsanyi propounded this view with his principle of preference autonomy:6

在决定对于某个人来说什么是好什么是坏时,最终的标准只能是他自己的愿望和自己的偏好。

In deciding what is good and what is bad for a given individual, the ultimate criterion can only be his own wants and his own preferences.

因此,哈萨尼的偏好功利主义与有益人工智能的第一原则大致一致,该原则认为机器的唯一目的是实现人类的偏好。人工智能研究人员绝对不应该参与决定人类的偏好应该是什么!与边沁一样,哈萨尼将这些原则视为公共决策的指南;他并不期望个人如此无私。他也不期望个人完全理性——例如,他们可能有与他们的“更深层次的偏好”相矛盾的短期欲望。最后,他建议忽略那些像前面提到的虐待狂哈丽特一样积极希望减少他人福祉的人的偏好。

Harsanyi’s preference utilitarianism is therefore roughly consistent with the first principle of beneficial AI, which says that a machine’s only purpose is the realization of human preferences. AI researchers should definitely not be in the business of deciding what human preferences should be! Like Bentham, Harsanyi views such principles as a guide for public decisions; he does not expect individuals to be so selfless. Nor does he expect individuals to be perfectly rational—for example, they might have short-term desires that contradict their “deeper preferences.” Finally, he proposes to ignore the preferences of those who, like the sadistic Harriet mentioned earlier, actively wish to reduce the well-being of others.

哈萨尼还给出了一种证明,表明最优的道德决策应当使整个人群的平均效用最大化。7他所假设的公设相当弱,类似于个人效用理论所依据的那些公设。(主要的附加公设是:如果人群中的每个人都对两种结果无所偏好,那么代表该人群行事的代理人也应当对这两种结果无所偏好。)根据这些公设,他证明了后来被称为社会聚合定理的结论:代表一群个体行事的代理人,必须最大化这些个体效用的某个加权线性组合。他进一步论证,一个"非个人的"代理人应当使用相等的权重。

Harsanyi also gives a kind of proof that optimal moral decisions should maximize the average utility across a population of humans.7 He assumes fairly weak postulates similar to those that underlie utility theory for individuals. (The primary additional postulate is that if everyone in a population is indifferent between two outcomes, then an agent acting on behalf of the population should be indifferent between those outcomes.) From these postulates, he proves what became known as the social aggregation theorem: an agent acting on behalf of a population of individuals must maximize a weighted linear combination of the utilities of the individuals. He further argues that an “impersonal” agent should use equal weights.
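In code, the form of the theorem's conclusion is essentially a one-liner. The sketch below is illustrative only; the utilities and weights are placeholders:

```python
# The form of the theorem's conclusion: a social agent maximizes a weighted
# linear combination of individual utilities; an "impersonal" agent uses
# equal weights. Utilities and weights below are invented placeholders.

def social_utility(utilities, weights=None):
    if weights is None:
        weights = [1.0 / len(utilities)] * len(utilities)  # equal weights
    return sum(w * u for w, u in zip(weights, utilities))

print(social_utility([3.0, 1.0]))              # 2.0: the average utility
print(social_utility([3.0, 1.0], [0.9, 0.1]))  # 2.8: a partial agent
```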

该定理需要一个关键的附加(且未言明的)假设:每个人对世界及其将如何演变持有相同的事实性先验信念。任何父母都知道,这一点甚至对兄弟姐妹都不成立,更不用说来自不同社会背景和文化的个体了。那么,当个体的信念不同时会发生什么?会发生相当奇怪的事情:8分配给每个人效用的权重必须随时间变化,与该个体的先验信念同不断展开的现实相符的程度成正比。

The theorem requires one crucial additional (and unstated) assumption: each individual has the same prior factual beliefs about the world and how it will evolve. Now, any parent knows that this isn’t even true for siblings, let alone individuals from different social backgrounds and cultures. So, what happens when individuals differ in their beliefs? Something rather strange:8 the weight assigned to each individual’s utility has to change over time, in proportion to how well that individual’s prior beliefs accord with unfolding reality.

对于任何父母来说,这种听起来相当不平等的公式都很熟悉。假设机器人罗比负责照顾两个孩子,爱丽丝和鲍勃。爱丽丝想去看电影,并且确定今天会下雨;另一方面,鲍勃想去海滩,并且确定天气会是晴天。罗比可以宣布“我们要去看电影”,这会让鲍勃不高兴;或者他可以宣布“我们要去海滩”,这会让爱丽丝不高兴;或者他可以宣布“如果下雨,我们就去看电影,但如果天气晴朗,我们就去海滩。”最后一个计划让爱丽丝和鲍勃都很高兴,因为他们都相信自己的信念。

This rather inegalitarian-sounding formula is quite familiar to any parent. Let’s say that Robbie the robot has been tasked with looking after two children, Alice and Bob. Alice wants to go to the movies and is sure it’s going to rain today; Bob, on the other hand, wants to go to the beach and is sure it’s going to be sunny. Robbie could announce, “We’re going to the movies,” making Bob unhappy; or he could announce, “We’re going to the beach,” making Alice unhappy; or he could announce, “If it rains, we’re going to the movies, but if it’s sunny, we’ll go to the beach.” This last plan makes both Alice and Bob happy, because both believe in their own beliefs.
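The arithmetic behind this little parable is worth spelling out. In the hedged sketch below, each child evaluates the conditional plan under their own prior about the weather; all numbers are invented:

```python
# Each child evaluates the conditional plan ("movies if rain, beach if
# sunny") under their *own* prior about the weather. Utilities: 1 = got
# what they wanted, 0 = didn't. All numbers are invented.

def expected_utility(p_rain, u_if_rain, u_if_sun):
    return p_rain * u_if_rain + (1 - p_rain) * u_if_sun

alice = expected_utility(p_rain=0.9, u_if_rain=1, u_if_sun=0)  # wants movies
bob = expected_utility(p_rain=0.1, u_if_rain=0, u_if_sun=1)    # wants the beach
print(alice, bob)  # 0.9 0.9 -- both expect to get their wish
```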

对功利主义的挑战

Challenges to utilitarianism

功利主义是人类长期寻求道德指南所产生的提案之一;在众多这类提案中,它是表述最明确的——因此也是最容易出现漏洞的。一百多年来,哲学家们一直在寻找这些漏洞。例如,G. E. 摩尔反对边沁对最大化快乐的强调,他设想了一个"除了快乐之外绝对什么都不存在的世界——没有知识、没有爱、没有对美的享受、没有道德品质"。9这一观点在现代得到了呼应:斯图尔特·阿姆斯特朗指出,负责最大化快乐的超级智能机器可能会"把每个人都封进滴注海洛因的水泥棺材里"。10另一个例子:1945 年,卡尔·波普尔提出了将人类苦难最小化这一值得称道的目标,11认为用一个人的痛苦去换取另一个人的快乐是不道德的;R. N. 斯马特回应说,实现这一目标的最佳方式是让人类灭绝。12如今,机器可能通过终结我们的存在来终结人类苦难这一想法,已成为关于人工智能生存风险辩论中的常见话题。13第三个例子是 G. E. 摩尔对幸福来源之真实性的强调,它修正了早期似乎留有漏洞的定义,那些定义允许通过自我欺骗来最大化幸福。这一论点的现代对应物包括电影《黑客帝国》(其中当今的现实原来是计算机模拟产生的幻觉),以及近来关于强化学习中自我欺骗问题的研究。14

Utilitarianism is one proposal to emerge from humanity’s long-standing search for a moral guide; among many such proposals, it is the most clearly specified—and therefore the most susceptible to loopholes. Philosophers have been finding these loopholes for more than a hundred years. For example, G. E. Moore, objecting to Bentham’s emphasis on maximizing pleasure, imagined a “world in which absolutely nothing except pleasure existed—no knowledge, no love, no enjoyment of beauty, no moral qualities.”9 This finds its modern echo in Stuart Armstrong’s point that superintelligent machines tasked with maximizing pleasure might “entomb everyone in concrete coffins on heroin drips.”10 Another example: in 1945, Karl Popper proposed the laudable goal of minimizing human suffering,11 arguing that it was immoral to trade one person’s pain for another person’s pleasure; R. N. Smart responded that this could best be achieved by rendering the human race extinct.12 Nowadays, the idea that a machine might end human suffering by ending our existence is a staple of debates over the existential risk from AI.13 A third example is G. E. Moore’s emphasis on the reality of the source of happiness, amending earlier definitions that seemed to have a loophole allowing maximization of happiness through self-delusion. The modern analogs of this point include The Matrix (in which present-day reality turns out to be an illusion produced by a computer simulation) and recent work on the self-delusion problem in reinforcement learning.14

这些例子以及其他例子让我相信,人工智能界应该认真关注关于功利主义的哲学和经济争论的正反两方面,因为它们与当前的任务直接相关。从设计将使多个人受益的人工智能系统的角度来看,其中最重要的两个方面涉及效用的人际比较和不同人口规模的效用比较。这两场争论已经持续了 150 多年,这让人怀疑它们的令人满意的解决可能并不完全简单。

These examples, and more, convince me that the AI community should pay careful attention to the thrusts and counterthrusts of philosophical and economic debates on utilitarianism because they are directly relevant to the task at hand. Two of the most important, from the point of view of designing AI systems that will benefit multiple individuals, concern interpersonal comparisons of utilities and comparisons of utilities across different population sizes. Both of these debates have been raging for 150 years or more, which leads one to suspect their satisfactory resolution may not be entirely straightforward.

关于效用的人际比较的争论之所以重要,是因为除非爱丽丝和鲍勃的效用可以相加,否则罗比就无法最大化二者效用之和;而只有当它们可以在同一尺度上衡量时,才能相加。19 世纪英国逻辑学家和经济学家威廉·斯坦利·杰文斯(他也是一种名为"逻辑钢琴"的早期机械计算机的发明者)在 1871 年提出,人际比较是不可能的:15

The debate on interpersonal comparisons of utilities matters because Robbie cannot maximize the sum of Alice’s and Bob’s utilities unless those utilities can be added; and they can be added only if they are measurable on the same scale. The nineteenth-century British logician and economist William Stanley Jevons (also the inventor of an early mechanical computer called the logical piano) argued in 1871 that interpersonal comparisons are impossible:15

据我们所知,一个心灵的感受性可能比另一个心灵强一千倍。但只要这种感受性在所有方向上都以相同的比例不同,我们就永远无法发现这种最深刻的差异。因此,每个心灵对其他任何心灵来说都是不可测知的,感觉也就不可能有任何公分母。

The susceptibility of one mind may, for what we know, be a thousand times greater than that of another. But, provided that the susceptibility was different in a like ratio in all directions, we should never be able to discover the profoundest difference. Every mind is thus inscrutable to every other mind, and no common denominator of feeling is possible.

美国经济学家、现代社会选择理论的创始人、1972 年诺贝尔经济学奖获得者肯尼斯·阿罗也同样坚定地表示:

The American economist Kenneth Arrow, founder of modern social choice theory and 1972 Nobel laureate, was equally adamant:

本文将采取这样的观点:人际效用比较没有意义;而且事实上,个人效用的可测量性中也不存在任何与福利比较相关的意义。

The viewpoint will be taken here that interpersonal comparison of utilities has no meaning and, in fact, there is no meaning relevant to welfare comparisons in the measurability of individual utility.

杰文斯和阿罗提到的困难在于,没有明显的方法可以判断爱丽丝在主观幸福感方面对针扎和棒棒糖的评价是 -1 和 +1 还是 -1000 和 +1000。无论是哪种情况,她都会支付最多一根棒棒糖来避免针扎。事实上,如果爱丽丝是一个类人机器人,即使她没有任何主观幸福感,她的外部行为也可能是一样的。

The difficulty to which Jevons and Arrow are referring is that there is no obvious way to tell if Alice values pinpricks and lollipops at −1 and +1 or −1000 and +1000 in terms of her subjective experience of happiness. In either case, she will pay up to one lollipop to avoid one pinprick. Indeed, if Alice is a humanoid automaton, her external behavior might be the same even though there is no subjective experience of happiness whatsoever.

1974 年,美国哲学家罗伯特·诺齐克提出,即使可以进行效用的人际比较,最大化效用总和仍然不是一个好主意,因为它会栽在"效用怪兽"手里——一个其快乐和痛苦体验比普通人强烈许多倍的人。16这样的人可以断言,任何额外的资源单位如果给他而不是给别人,都会为人类幸福总量带来更大的增量;事实上,从别人那里拿走资源去造福效用怪兽,也会是一个好主意。

In 1974, the American philosopher Robert Nozick suggested that even if interpersonal comparisons of utility could be made, maximizing the sum of utilities would still be a bad idea because it would fall foul of the utility monster—a person whose experiences of pleasure and pain are many times more intense than those of ordinary people.16 Such a person could assert that any additional unit of resources would yield a greater increment to the sum total of human happiness if given to him rather than to others; indeed, removing resources from others to benefit the utility monster would also be a good idea.

这似乎是一个明显不受欢迎的结果,但结果主义本身并不能解决问题:问题在于我们如何衡量结果的可取性。一种可能的回应是,效用怪物只是理论上的——没有这样的人。但这种回应可能不行:从某种意义上说,相对于老鼠和细菌,所有人类都是效用怪物,这就是为什么我们在制定公共政策时很少关注老鼠和细菌的偏好。

This might seem to be an obviously undesirable consequence, but consequentialism by itself cannot come to the rescue: the problem lies in how we measure the desirability of consequences. One possible response is that the utility monster is merely theoretical—there are no such people. But this response probably won’t do: in a sense, all humans are utility monsters relative to, say, rats and bacteria, which is why we pay little attention to the preferences of rats and bacteria in setting public policy.

如果认为不同的实体具有不同的效用尺度已经融入我们的思维方式,那么不同的人也可能有不同的尺度。

If the idea that different entities have different utility scales is already built into our way of thinking, then it seems entirely possible that different people have different scales too.

另一种回应是说“运气真不好!”并假设每个人都有相同的量表,即使他们没有。17人们还可以尝试使用杰文斯无法使用的科学方法来研究这个问题,比如测量多巴胺水平或与快乐和痛苦、幸福和悲惨相关的神经元的电兴奋程度。如果爱丽丝和鲍勃对棒棒糖的化学和神经反应几乎相同,他们的行为反应(微笑、咂嘴声等)也完全相同,那么坚持认为他们的主观享受程度相差一千或一百万倍似乎很奇怪。最后,人们可以使用时间等通用货币(我们所有人的时间大致相同)——例如,通过比较棒棒糖和针扎与机场候机室额外五分钟的等候时间。

Another response is to say “Tough luck!” and operate on the assumption that everyone has the same scale, even if they don’t.17 One could also try to investigate the issue by scientific means unavailable to Jevons, such as measuring dopamine levels or the degree of electrical excitation of neurons related to pleasure and pain, happiness and misery. If Alice’s and Bob’s chemical and neural responses to a lollipop are pretty much identical, as well as their behavioral responses (smiling, making lip-smacking noises, and so on), it seems odd to insist that, nevertheless, their subjective degrees of enjoyment differ by a factor of a thousand or a million. Finally, one could use common currencies such as time (of which we all have, very roughly, the same amount)—for example, by comparing lollipops and pinpricks against, say, five minutes extra waiting time in the airport departure lounge.

我远没有杰文斯和阿罗悲观。我认为,比较不同个体的效用确实很有意义,尺度可能有所不同,但通常差别不大,机器可以从对人类偏好尺度的合理广泛先验信念开始,通过长期观察更多地了解个体尺度,或许可以将自然观察与神经科学研究的发现联系起来。

I am far less pessimistic than Jevons and Arrow. I suspect that it is indeed meaningful to compare utilities across individuals, that scales may differ but typically not by very large factors, and that machines can begin with reasonably broad prior beliefs about human preference scales and learn more about the scales of individuals by observation over time, perhaps correlating natural observations with the findings of neuroscience research.

第二个争论是关于不同规模的人口之间的效用比较,当决策会影响到未来谁将存在时,这个问题就变得很重要。例如,在电影《复仇者联盟:无限战争》中,灭霸这个角色提出并实施了这样一种理论:如果人口数量减少一半,那么剩下的每个人的幸福感都会增加一倍以上。这种天真的计算让功利主义名声扫地。18

The second debate—about utility comparisons across populations of different sizes—matters when decisions have an impact on who will exist in the future. In the movie Avengers: Infinity War, for example, the character Thanos develops and implements the theory that if there were half as many people, everyone who remained would be more than twice as happy. This is the kind of naïve calculation that gives utilitarianism a bad name.18

1874 年,英国哲学家亨利·西奇威克在其著名论著《伦理学方法》中讨论了同样的问题(只是没有无限宝石和庞大的预算)。19西奇威克的结论与灭霸显然一致:正确的选择是调整人口规模,直到达到最大的总幸福。(显然,这并不意味着无限制地增加人口,因为到某个时刻,每个人都会饿死,从而变得相当不幸福。)1984 年,英国哲学家德里克·帕菲特在其开创性著作《理由与人》中再次提出了这个问题。20帕菲特认为,对于任何有 N 个非常幸福的人的人口情形,(根据功利主义原则)都存在一个更可取的情形:有 2N 个人,每个人只是稍微不那么幸福。这似乎非常有道理。不幸的是,这也是一个滑坡。重复这个过程,我们就得出了所谓的"令人厌恶的结论"(通常如此大写,也许是为了强调其维多利亚时代的根源):最理想的情形是一个人口庞大、所有人的生活都几乎不值得过的世界。

The same question—minus the Infinity Stones and the gargantuan budget—was discussed in 1874 by the British philosopher Henry Sidgwick in his famous treatise, The Methods of Ethics.19 Sidgwick, in apparent agreement with Thanos, concluded that the right choice was to adjust the population size until the maximum total happiness was reached. (Obviously, this does not mean increasing the population without limit, because at some point everyone would be starving to death and hence rather unhappy.) In 1984, the British philosopher Derek Parfit took up the issue again in his groundbreaking work Reasons and Persons.20 Parfit argues that for any situation with a population of N very happy people, there is (according to utilitarian principles) a preferable situation with 2N people who are ever so slightly less happy. This seems highly plausible. Unfortunately, it’s also a slippery slope. By repeating the process, we reach the so-called Repugnant Conclusion (usually capitalized thus, perhaps to emphasize its Victorian roots): that the most desirable situation is one with a vast population, all of whom have a life barely worth living.
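The slippery slope can be seen in a few lines of arithmetic. Under total utilitarianism, a population of 2N people at happiness h − ε beats N people at happiness h whenever 2N(h − ε) > Nh, that is, whenever ε < h/2. A toy sketch, with invented numbers:

```python
# The slippery slope in arithmetic: total utilitarianism prefers 2N people
# at happiness h - eps over N people at happiness h whenever
# 2*N*(h - eps) > N*h, i.e. whenever eps < h/2. Numbers are invented.

N, h, eps = 1_000_000, 100.0, 1.0
for _ in range(10):
    print(f"population {N:>13,}   happiness {h:6.2f}   total {N * h:,.0f}")
    N, h = 2 * N, h - eps  # each doubling raises the total, since eps < h/2
```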

可以想象,这样的结论是有争议的。帕菲特本人也为解决自己的难题奋斗了三十多年,但始终没有成功。我怀疑我们缺少一些基本公理,类似于个人理性偏好的公理,来处理不同规模和幸福水平的人群之间的选择。21

As you can imagine, such a conclusion is controversial. Parfit himself struggled for over thirty years to find a solution to his own conundrum, without success. I suspect we are missing some fundamental axioms, analogous to those for individually rational preferences, to handle choices between populations of different sizes and happiness levels.21

解决这个问题很重要,因为具有足够远见的机器可能能够考虑导致不同人口规模的行动方案,就像中国政府在 1979 年实施独生子女政策一样。例如,我们很可能会要求人工智能系统帮助制定全球气候变化的解决方案——而这些解决方案很可能涉及限制甚至减少人口规模的政策。22另一方面,如果我们认为人口越多越好,如果我们非常重视几个世纪后可能庞大的人类人口的福祉,那么我们将需要更加努力地寻找超越地球限制的方法。如果机器的计算得出令人厌恶的结论或其相反结论——极少数人处于最佳幸福状态——我们可能有理由为我们在这个问题上缺乏进展而感到遗憾。

It is important that we solve this problem, because machines with sufficient foresight may be able to consider courses of action leading to different population sizes, just as the Chinese government did with its one-child policy in 1979. It’s quite likely, for example, that we will be asking AI systems for help in devising solutions for global climate change—and those solutions may well involve policies that tend to limit or even reduce population size.22 On the other hand, if we decide that larger populations really are better and if we give significant weight to the well-being of potentially vast human populations centuries from now, then we will need to work much harder on finding ways to move beyond the confines of Earth. If the machines’ calculations lead to the Repugnant Conclusion or to its opposite—a tiny population of optimally happy people—we may have reason to regret our lack of progress on the question.

一些哲学家认为,我们可能需要在道德不确定状态下做出决策——即不确定在决策时应采用哪种适当的道德理论。23一种解决方案是为每种道德理论分配一些概率,并根据“预期道德值”做出决策。但目前尚不清楚将概率归因于道德理论是否有意义,就像将概率应用于明天的天气一样。(灭霸完全正确的概率是多少?)即使这有意义,相互竞争的道德理论建议之间可能存在巨大差异,这意味着在我们做出如此重大的决定或将它们委托给机器之前,必须解决道德不确定性——找出哪种道德理论可以避免不可接受的后果。

Some philosophers have argued that we may need to make decisions in a state of moral uncertainty—that is, uncertainty about the appropriate moral theory to employ in making decisions.23 One solution is to allocate some probability to each moral theory and make decisions using an “expected moral value.” It’s not clear, however, that it makes sense to ascribe probabilities to moral theories in the same way one applies probabilities to tomorrow’s weather. (What’s the probability that Thanos is exactly right?) And even if it does make sense, the potentially vast differences between the recommendations of competing moral theories mean that resolving the moral uncertainty—working out which moral theory avoids unacceptable consequences—has to happen before we make such momentous decisions or entrust them to machines.
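A minimal sketch of the "expected moral value" proposal follows, with invented theory names and numbers; it also shows why vast disagreements between theories dominate the result:

```python
# "Expected moral value": assign each moral theory a probability, score the
# action under each theory, and average. All values are invented; the point
# is that large disagreements between theories dominate the answer.

theory_prob = {"theory_A": 0.5, "theory_B": 0.5}     # hypothetical theories
action_value = {"theory_A": 10.0, "theory_B": -50.0}  # one action, two verdicts

emv = sum(p * action_value[t] for t, p in theory_prob.items())
print(emv)  # -20.0
```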

让我们乐观一点,假设哈丽特最终解决了这个问题,以及由于地球上存在不止一个人而产生的其他问题。适当的利他主义和平等主义算法被下载到世界各地的机器人中。击掌和欢快的音乐响起。然后哈丽特回家了……

Let’s be optimistic and suppose that Harriet eventually solves this and other problems arising from the existence of more than one person on Earth. Suitably altruistic and egalitarian algorithms are downloaded into robots all over the world. Cue the high fives and happy-sounding music. Then Harriet goes home. . . .

罗比:欢迎回家!漫长的一天?

哈丽特:是的,工作非常努力,甚至没有时间吃午饭。

罗比:那你一定很饿了!

哈丽特:饿死了!你能给我做点晚饭吗?

罗比:我有件事要告诉你……

哈丽特:什么?别告诉我冰箱是空的!

罗比:不,索马里有更急需帮助的人。我现在就要走了。请你自己做晚饭。

ROBBIE: Welcome home! Long day?

HARRIET: Yes, worked really hard, not even time for lunch.

ROBBIE: So you must be quite hungry!

HARRIET: Starving! Can you make me some dinner?

ROBBIE: There’s something I need to tell you. . . .

HARRIET: What? Don’t tell me the fridge is empty!

ROBBIE: No, there are humans in Somalia in more urgent need of help. I am leaving now. Please make your own dinner.

虽然哈丽特可能对罗比以及她自己为使他成为如此正直和体面的机器所做的贡献感到自豪,但她不禁想知道为什么她要花一大笔钱来买一个机器人,而这个机器人的第一个重要行为就是消失。当然,在实践中,没有人买这样的机器人,所以不会制造这样的机器人,也不会给人类带来任何好处。我们称之为索马里问题。要使整个功利机器人计划发挥作用,我们必须找到解决这个问题的方法。罗比需要对哈丽特有一定程度的忠诚度——也许与哈丽特为罗比支付的金额有关。也许,如果社会希望罗比帮助哈丽特以外的人,社会将需要补偿哈丽特对罗比服务的索求。机器人很可能会相互协调,这样它们就不会同时降临索马里——在这种情况下,罗比可能根本不需要去。或者,也许会出现一些全新的经济关系来处理世界上数十亿纯粹利他主义代理人的存在(这无疑是史无前例的)。

While Harriet might be quite proud of Robbie and of her own contributions towards making him such an upstanding and decent machine, she cannot help but wonder why she shelled out a small fortune to buy a robot whose first significant act is to disappear. In practice, of course, no one would buy such a robot, so no such robots would be built and there would be no benefit to humanity. Let’s call this the Somalia problem. For the whole utilitarian-robot scheme to work, we have to find a solution to this problem. Robbie will need to have some amount of loyalty to Harriet in particular—perhaps an amount related to the amount Harriet paid for Robbie. Possibly, if society wants Robbie to help people besides Harriet, society will need to compensate Harriet for its claim on Robbie’s services. It’s quite likely that robots will coordinate with one another so that they don’t all descend on Somalia at once—in which case, Robbie might not need to go after all. Or perhaps some completely new kinds of economic relationships will emerge to handle the (certainly unprecedented) presence of billions of purely altruistic agents in the world.

善良、邪恶和嫉妒的人类

Nice, Nasty, and Envious Humans

人类的偏好远不止于快乐和披萨,它们当然也延伸到他人的福祉。即使是在需要为自私辩护时经常被引用的经济学之父亚当·斯密,也在他的第一本书的开头就强调了关心他人的至关重要性:24

Human preferences go far beyond pleasure and pizza. They certainly extend to the well-being of others. Even Adam Smith, the father of economics who is often cited when a justification for selfishness is required, began his first book by emphasizing the crucial importance of concern for others:24

无论人们认为人多么自私,他的天性中显然存在着一些原则,使他关心他人的命运,并把他人的幸福看成自己的必需,尽管他除了看到他人幸福而感到高兴之外一无所得。怜悯或同情便属于这一类,即当我们亲眼目睹他人的不幸,或被迫以非常生动的方式设想这种不幸时,我们为之感受到的情感。我们常常因他人的悲伤而悲伤,这是显而易见的事实,无需任何实例来证明。

How selfish soever man may be supposed, there are evidently some principles in his nature, which interest him in the fortune of others, and render their happiness necessary to him, though he derives nothing from it except the pleasure of seeing it. Of this kind is pity or compassion, the emotion which we feel for the misery of others, when we either see it, or are made to conceive it in a very lively manner. That we often derive sorrow from the sorrow of others, is a matter of fact too obvious to require any instances to prove it.

在现代经济学的用语中,对他人的关心通常归在利他主义的名目之下。25利他主义理论已相当成熟,并对税收政策等诸多问题有着重大影响。必须指出,一些经济学家把利他主义视为另一种形式的自私,旨在为给予者提供一种"温暖的光晕"。26机器人在解释人类行为时当然需要意识到这种可能性,但现在让我们先善意地假定,人类确实是真心关心他人的。

In modern economic parlance, concern for others usually goes under the heading of altruism.25 The theory of altruism is fairly well developed and has significant implications for tax policy among other matters. Some economists, it must be said, treat altruism as another form of selfishness designed to provide the giver with a “warm glow.”26 This is certainly a possibility that robots need to be aware of as they interpret human behavior, but for now let’s give humans the benefit of the doubt and assume they do actually care.

思考利他主义最简单的方式就是将人的偏好分为两类:对自己内在幸福的偏好和对他人幸福的偏好。(关于是否可以将二者截然分开,存在着相当大的争议,但我将这一争议放在一边。)内在幸福指的是一个人自身生活的品质,如住所、温暖、食物、安全等等,这些品质本身就是令人向往的,而不是指他人生活的品质。

The easiest way to think about altruism is to divide one’s preferences into two kinds: preferences for one’s own intrinsic well-being and preferences concerning the well-being of others. (There is considerable dispute about whether these can be neatly separated, but I’ll put that dispute to one side.) Intrinsic well-being refers to qualities of one’s own life, such as shelter, warmth, sustenance, safety, and so on, that are desirable in themselves rather than by reference to qualities of the lives of others.

为了使这个概念更加具体,我们假设这个世界上有两个人:爱丽丝和鲍勃。爱丽丝的总体效用由她自己的内在幸福感,加上某个因子 C_AB 乘以鲍勃的内在幸福感组成。关怀因子 C_AB 表示爱丽丝对鲍勃的关心程度。同样,鲍勃的总体效用由他自己的内在幸福感,加上某个关怀因子 C_BA 乘以爱丽丝的内在幸福感组成,其中 C_BA 表示鲍勃对爱丽丝的关心程度。27罗比试图同时帮助爱丽丝和鲍勃,这意味着(比方说)最大化他们两人效用之和。因此,罗比不仅需要关注每个人各自的幸福感,还需要关注每个人对对方幸福感的关心程度。28

To make this notion more concrete, let’s suppose that the world contains two people, Alice and Bob. Alice’s overall utility is composed of her own intrinsic well-being plus some factor C_AB times Bob’s intrinsic well-being. The caring factor C_AB indicates how much Alice cares about Bob. Similarly, Bob’s overall utility is composed of his intrinsic well-being plus some caring factor C_BA times Alice’s intrinsic well-being, where C_BA indicates how much Bob cares about Alice.27 Robbie is trying to help both Alice and Bob, which means (let’s say) maximizing the sum of their two utilities. Thus, Robbie needs to pay attention not just to the individual well-being of each but also to how much each cares about the well-being of the other.28

The signs of the caring factors C_AB and C_BA matter a lot. For example, if C_AB is positive, Alice is “nice”: she derives some happiness from Bob’s well-being. The more positive C_AB is, the more Alice is willing to sacrifice some of her own well-being to help Bob. If C_AB is zero, then Alice is completely selfish: if she can get away with it, she will divert any amount of resources away from Bob and towards herself, even if Bob is left destitute and starving. Faced with selfish Alice and nice Bob, a utilitarian Robbie will obviously protect Bob from Alice’s worst depredations. It’s interesting that the final equilibrium will typically leave Bob with less intrinsic well-being than Alice, but he may have greater overall happiness because he cares about her well-being. You might feel that Robbie’s decisions are grossly unfair if they leave Bob with less well-being than Alice merely because he is nicer than she is: Wouldn’t he resent the outcome and be unhappy?29 Well, he might, but that would be a different model—one that includes a term for resentment over differences in well-being. In our simple model Bob would be at peace with the outcome. Indeed, in the equilibrium situation, he would resist any attempt to transfer resources from Alice to himself, since that would reduce his overall happiness. If you think this is completely unrealistic, consider the case where Alice is Bob’s newborn daughter.
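
To make the model concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that intrinsic well-being grows as the square root of resources (diminishing returns) and uses invented caring factors; none of these specifics come from the text.

    import math

    C_AB = 0.0   # Alice is completely selfish
    C_BA = 0.5   # Bob is nice: he cares about Alice's well-being

    def wellbeing(resources):
        # Illustrative assumption: diminishing returns on resources.
        return math.sqrt(resources)

    def overall_utilities(r_alice, r_bob):
        w_a, w_b = wellbeing(r_alice), wellbeing(r_bob)
        return w_a + C_AB * w_b, w_b + C_BA * w_a

    # Robbie divides 10 units of resources to maximize the sum of utilities.
    total, r_alice = max((sum(overall_utilities(r / 10, 10 - r / 10)), r / 10)
                         for r in range(101))
    u_a, u_b = overall_utilities(r_alice, 10 - r_alice)
    print(r_alice, u_a, u_b)
    # -> Alice gets about 6.9 of the 10 units, yet Bob's overall utility
    # (about 3.1) exceeds hers (about 2.6), because he also enjoys her share.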

The really problematic case for Robbie to deal with is when C_AB is negative: in that case, Alice is truly nasty. I’ll use the phrase negative altruism to refer to such preferences. As with the sadistic Harriet mentioned earlier, this is not about garden-variety greed and selfishness, whereby Alice is content to reduce Bob’s share of the pie in order to enhance her own. Negative altruism means that Alice derives happiness purely from the reduced well-being of others, even if her own intrinsic well-being is unchanged.

In his paper that introduced preference utilitarianism, Harsanyi attributes negative altruism to “sadism, envy, resentment, and malice” and argues that they should be ignored in calculating the sum total of human utility in a population:

No amount of goodwill to individual X can impose the moral obligation on me to help him in hurting a third person, individual Y.

This seems to be one area in which it is reasonable for the designers of intelligent machines to put a (cautious) thumb on the scales of justice, so to speak.

Unfortunately, negative altruism is far more common than one might expect. It arises not so much from sadism and malice30 as from envy and resentment and their converse emotion, which I will call pride (for want of a better word). If Bob envies Alice, he derives unhappiness from the difference between Alice’s well-being and his own; the greater the difference, the more unhappy he is. Conversely, if Alice is proud of her superiority over Bob, she derives happiness not just from her own intrinsic well-being but also from the fact that it is higher than Bob’s. It is easy to show that, in a mathematical sense, pride and envy work in roughly the same way as sadism; they lead Alice and Bob to derive happiness purely from reducing each other’s well-being, because a reduction in Bob’s well-being increases Alice’s pride, while a reduction in Alice’s well-being reduces Bob’s envy.31
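
To see the equivalence, write Bob’s overall utility in the caring-factor notation used above and add an envy term; E_BA is my label for an envy coefficient, and the linear form is an illustrative assumption:

    U_B = w_B + C_BA * w_A - E_BA * (w_A - w_B)
        = (1 + E_BA) * w_B + (C_BA - E_BA) * w_A

Envy thus acts exactly like a reduction in Bob’s caring factor: whenever E_BA exceeds C_BA, the coefficient on Alice’s well-being w_A is negative, and Bob gains utility from reductions in it, just as a sadistic Bob would.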

Jeffrey Sachs, the renowned development economist, once told me a story that illustrated the power of these kinds of preferences in people’s thinking. He was in Bangladesh soon after a major flood had devastated one region of the country. He was speaking to a farmer who had lost his house, his fields, all his animals, and one of his children. “I’m so sorry—you must be terribly sad,” Sachs ventured. “Not at all,” replied the farmer. “I’m pretty happy because my damned neighbor has lost his wife and all his children too!”

The economic analysis of pride and envy—particularly in the context of social status and conspicuous consumption—came to the fore in the work of the American sociologist Thorstein Veblen, whose 1899 book, The Theory of the Leisure Class, explained the toxic consequences of these attitudes.32 In 1977, the British economist Fred Hirsch published The Social Limits to Growth,33 in which he introduced the idea of positional goods. A positional good is anything—it could be a car, a house, an Olympic medal, an education, an income, or an accent—that derives its perceived value not just from its intrinsic benefits but also from its relative properties, including the properties of scarcity and being superior to someone else’s. The pursuit of positional goods, driven by pride and envy, has the character of a zero-sum game, in the sense that Alice cannot improve her relative position without worsening the relative position of Bob, and vice versa. (This doesn’t seem to prevent vast sums being squandered in this pursuit.) Positional goods seem to be ubiquitous in modern life, so machines will need to understand their overall importance in the preferences of individuals. Moreover, social identity theorists propose that membership and standing within a group and the overall status of the group relative to other groups are essential constituents of human self-esteem.34 Thus, it is difficult to understand human behavior without understanding how individuals perceive themselves as members of groups—whether those groups are species, nations, ethnic groups, political parties, professions, families, or supporters of a particular football team.

As with sadism and malice, we might propose that Robbie should give little or no weight to pride and envy in his plans for helping Alice and Bob. There are some difficulties with this proposal, however. Because pride and envy counteract caring in Alice’s attitude to Bob’s well-being, it may not be easy to tease them apart. It may be that Alice cares a lot, but also suffers from envy; it is hard to distinguish this Alice from a different Alice who cares only a little bit but has no envy at all. Moreover, given the prevalence of pride and envy in human preferences, it’s essential to consider very carefully the ramifications of ignoring them. It might be that they are essential for self-esteem, especially in their positive forms—self-respect and admiration for others.

Let me reemphasize a point made earlier: suitably designed machines will not behave like those they observe, even if those machines are learning about the preferences of sadistic demons. It’s possible, in fact, that if we humans find ourselves in the unfamiliar situation of dealing with purely altruistic entities on a daily basis, we may learn to be better people ourselves—more altruistic and less driven by pride and envy.

Stupid, Emotional Humans

The title of this section is not meant to refer to some particular subset of humans. It refers to all of us. We are all incredibly stupid compared to the unreachable standard set by perfect rationality, and we are all subject to the ebb and flow of the varied emotions that, to a large extent, govern our behavior.

Let’s begin with stupidity. A perfectly rational entity maximizes the expected satisfaction of its preferences over all possible future lives it could choose to lead. I cannot begin to write down a number that describes the complexity of this decision problem, but I find the following thought experiment helpful. First, note that the number of motor control choices that a human makes in a lifetime is about twenty trillion. (See Appendix A for the detailed calculations.) Next, let’s see how far brute force will get us with the aid of Seth Lloyd’s ultimate-physics laptop, which is one billion trillion trillion times faster than the world’s fastest computer. We’ll give it the task of enumerating all possible sequences of English words (perhaps as a warmup for Jorge Luis Borges’s Library of Babel), and we’ll let it run for a year. How long are the sequences that it can enumerate in that time? A thousand pages of text? A million pages? No. Eleven words. This tells you something about the difficulty of designing the best possible life of twenty trillion actions. In short, we are much further from being rational than a slug is from overtaking the starship Enterprise traveling at warp nine. We have absolutely no idea what a rationally chosen life would be like.
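
The arithmetic behind “eleven words” can be checked in a few lines; the round numbers below (a vocabulary of about 1e5 words, 1e17 operations per second for today’s fastest machine) are my assumptions in the spirit of the text.

    import math

    vocabulary = 1e5
    # The ultimate laptop: a billion trillion trillion (1e33) times faster
    # than an assumed 1e17 operations/second, running for one year.
    ops_per_year = (1e17 * 1e33) * 3.15e7     # about 3e57 operations
    # Longest sequence length n such that vocabulary**n <= ops_per_year:
    n = math.log(ops_per_year) / math.log(vocabulary)
    print(int(n))                             # -> 11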

The implication of this is that humans will often act in ways that are contrary to their own preferences. For example, when Lee Sedol lost his Go match to AlphaGo, he played one or more moves that guaranteed he would lose, and AlphaGo could (in some cases at least) detect that he had done this. It would be incorrect, however, for AlphaGo to infer that Lee Sedol has a preference for losing. Instead, it would be reasonable to infer that Lee Sedol has a preference for winning but has some computational limitations that prevent him from choosing the right move in all cases. Thus, in order to understand Lee Sedol’s behavior and learn about his preferences, a robot following the third principle (“the ultimate source of information about human preferences is human behavior”) has to understand something about the cognitive processes that generate his behavior. It cannot assume he is rational.

This gives the AI, cognitive science, psychology, and neuroscience communities a very serious research problem: to understand enough about human cognition35 that we (or rather, our beneficial machines) can “reverse-engineer” human behavior to get at the deep underlying preferences, to the extent that they exist. Humans manage to do some of this, learning their values from others with a little bit of guidance from biology, so it seems possible. Humans have an advantage: they can use their own cognitive architecture to simulate that of other humans, without knowing what that architecture is—“If I wanted X, I’d do just the same thing as Mum does, so Mum must want X.”

Machines do not have this advantage. They can simulate other machines easily, but not people. It’s unlikely that they will soon have access to a complete model of human cognition, whether generic or tailored to specific individuals. Instead, it makes sense from a practical point of view to look at the major ways in which humans deviate from rationality and to study how to learn preferences from behavior that exhibits such deviations.
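
One common starting point in the research literature, offered here only as a sketch, is to model the human as noisily rational: better moves are more probable, but mistakes happen. The goals, payoffs, and the rationality parameter beta below are all invented for illustration.

    import math

    beta = 1.0   # finite beta stands in for computational limitations
    value = {"win":  {"good_move": 1.0, "blunder": 0.0},   # action values
             "lose": {"good_move": 0.0, "blunder": 1.0}}   # under each goal

    def action_prob(goal, action):
        # Actions are chosen with probability proportional to exp(beta * value).
        z = sum(math.exp(beta * v) for v in value[goal].values())
        return math.exp(beta * value[goal][action]) / z

    def posterior(actions, prior):
        post = dict(prior)
        for a in actions:
            post = {g: p * action_prob(g, a) for g, p in post.items()}
        z = sum(post.values())
        return {g: p / z for g, p in post.items()}

    # Two good moves and one blunder: the inferred preference stays "win"
    # (posterior about 0.73) rather than flipping to "lose".
    print(posterior(["good_move", "good_move", "blunder"],
                    {"win": 0.5, "lose": 0.5}))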

One obvious difference between humans and rational entities is that, at any given moment, we are not choosing among all possible first steps of all possible future lives. Not even close. Instead, we are typically embedded in a deeply nested hierarchy of “subroutines.” Generally speaking, we are pursuing near-term goals rather than maximizing preferences over future lives, and we can act only according to the constraints of the subroutine we’re in at present. Right now, for example, I’m typing this sentence: I can choose how to continue after the colon, but it never occurs to me to wonder if I should stop writing the sentence and take an online rap course or burn down the house and claim the insurance or any other of a gazillion things I could do next. Many of these other things might actually be better than what I’m doing, but, given my hierarchy of commitments, it’s as if those other things didn’t exist.

Understanding human action, then, seems to require understanding this subroutine hierarchy (which may be quite individual): which subroutine the person is executing at present, which near-term objectives are being pursued within this subroutine, and how they relate to deeper, long-term preferences. More generally, learning about human preferences seems to require learning about the actual structure of human lives. What are all the things that we humans can be engaged in, either singly or jointly? What activities are characteristic of different cultures and types of individuals? These are tremendously interesting and demanding research questions. Obviously, they do not have a fixed answer because we humans are adding new activities and behavioral structures to our repertoires all the time. But even partial and provisional answers would be very useful for all kinds of intelligent systems designed to help humans in their daily lives.
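
As a toy illustration, a commitment stack might be represented as follows; everything here is invented, a data-structure sketch rather than a proposal from the text.

    # At any moment, only the innermost subroutine's options are on the menu.
    commitments = [
        ("live a good life",    ["career", "family", "health"]),
        ("finish the chapter",  ["write", "revise", "check references"]),
        ("write this sentence", ["continue after the colon", "rephrase it"]),
    ]

    def available_choices(stack):
        # Options outside the innermost commitment are effectively invisible.
        return stack[-1][1]

    print(available_choices(commitments))
    # -> ['continue after the colon', 'rephrase it']; the online rap course
    # never even enters the choice set.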

Another obvious property of human actions is that they are often driven by emotion. In some cases, this is a good thing—emotions such as love and gratitude are of course partially constitutive of our preferences, and actions guided by them can be rational even if not fully deliberated. In other cases, emotional responses lead to actions that even we stupid humans recognize as less than rational—after the fact, of course. For example, an angry and frustrated Harriet who slaps a recalcitrant ten-year-old Alice may regret the action immediately. Robbie, observing the action, should (typically, although not in all cases) attribute the action to anger and frustration and a lack of self-control rather than deliberate sadism for its own sake. For this to work, Robbie has to have some understanding of human emotional states, including their causes, how they evolve over time in response to external stimuli, and the effects they have on action. Neuroscientists are beginning to get a handle on the mechanics of some emotional states and their connections to other cognitive processes,36 and there is some useful work on computational methods for detecting, predicting, and manipulating human emotional states,37 but there is much more to be learned. Again, machines are at a disadvantage when it comes to emotions: they cannot generate an internal simulation of an experience to see what emotional state it would engender.

As well as affecting our actions, emotions reveal useful information about our underlying preferences. For example, little Alice may be refusing to do her homework, and Harriet is angry and frustrated because she really wants Alice to do well in school and have a better chance in life than Harriet herself did. If Robbie is equipped to understand this—even if he cannot experience it himself—he may learn a great deal from Harriet’s less-than-rational actions. It ought to be possible, then, to create rudimentary models of human emotional states that suffice to avoid the most egregious errors in inferring human preferences from behavior.

Do Humans Really Have Preferences?

The entire premise of this book is that there are futures that we would like and futures we would prefer to avoid, such as near-term extinction or being turned into human battery farms à la The Matrix. In this sense, yes, of course humans have preferences. Once we get into the details of how humans would prefer their lives to play out, however, things become much murkier.

Uncertainty and error

One obvious property of humans, if you think about it, is that they don’t always know what they want. For example, the durian fruit elicits different responses from different people: some find that “it surpasses in flavour all other fruits of the world”38 while others liken it to “sewage, stale vomit, skunk spray and used surgical swabs.”39 I have deliberately refrained from trying durian prior to publication, so that I can maintain neutrality on this point: I simply don’t know which camp I will be in. The same might be said for many people considering future careers, future life partners, future post-retirement activities, and so on.

There are at least two kinds of preference uncertainty. The first is real, epistemic uncertainty, such as I experience about my durian preference.40 No amount of thought is going to resolve this uncertainty. There is an empirical fact of the matter, and I can find out more by trying some durian, by comparing my DNA with that of durian lovers and haters, and so on. The second arises from computational limitations: looking at two Go positions, I am not sure which I prefer because the ramifications of each are beyond my ability to resolve completely.

Uncertainty also arises from the fact that the choices we are presented with are usually incompletely specified—sometimes so incompletely that they barely qualify as choices at all. When Alice is about to graduate from high school, a career counselor might offer her a choice between “librarian” and “coal miner”; she may, quite reasonably, say, “I’m uncertain about which I prefer.” Here, the uncertainty comes from epistemic uncertainty about her own preferences for, say, coal dust versus book dust; from computational uncertainty as she struggles to work out how she might make the best of each career choice; and from ordinary uncertainty about the world, such as her doubts about the long-term viability of her local coal mine.

For these reasons, it’s a bad idea to identify human preferences with simple choices between incompletely described options that are intractable to evaluate and include elements of unknown desirability. Such choices provide indirect evidence of underlying preferences, but they are not constitutive of those preferences. That’s why I have couched the notion of preferences in terms of future lives—for example by imagining that you could experience, in a compressed form, two different movies of your future life and then express a preference between them (see this page). The thought experiment is of course impossible to carry out in practice, but one can imagine that in many cases a clear preference would emerge long before all the details of each movie had been filled in and fully experienced. You may not know in advance which you will prefer, even given a plot summary; but there is an answer to the actual question, based on who you are now, just as there is an answer to the question of whether you will like durian when you try it.

The fact that you might be uncertain about your own preferences does not cause any particular problems for the preference-based approach to provably beneficial AI. Indeed, there are already some algorithms that take into account both Robbie’s and Harriet’s uncertainty about Harriet’s preferences and allow for the possibility that Harriet may be learning about her preferences while Robbie is.41 Just as Robbie’s uncertainty about Harriet’s preferences can be reduced by observing Harriet’s behavior, Harriet’s uncertainty about her own preferences can be reduced by observing her own reactions to experiences. The two kinds of uncertainty need not be directly related; nor is Robbie necessarily more uncertain than Harriet about Harriet’s preferences. For example, Robbie might be able to detect that Harriet has a strong genetic predisposition to despise the flavor of durian. In that case, he would have very little uncertainty about her durian preference, even while she remains completely in the dark.
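
A tiny numerical illustration of how the two beliefs can come apart; the genetic marker and all probabilities are invented for the example.

    # Robbie and Harriet hold different beliefs about the same preference.
    harriet = {"likes durian": 0.5,  "hates durian": 0.5}   # no idea
    robbie  = {"likes durian": 0.05, "hates durian": 0.95}  # saw the marker

    def update(belief, likelihood):
        post = {h: belief[h] * likelihood[h] for h in belief}
        z = sum(post.values())
        return {h: p / z for h, p in post.items()}

    # Harriet tastes durian and grimaces; a grimace is nine times likelier
    # if she hates it. Both observers update on the same evidence:
    evidence = {"likes durian": 0.1, "hates durian": 0.9}
    print(update(harriet, evidence))   # -> roughly {0.10, 0.90}
    print(update(robbie, evidence))    # -> roughly {0.006, 0.994}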

If Harriet can be uncertain about her preferences over future events, then, quite probably, she can also be wrong. For example, she might be convinced that she will not like durian (or, say, green eggs and ham) and so she avoids it at all costs, but it may turn out—if someone slips some into her fruit salad one day—that she finds it sublime after all. Thus, Robbie cannot assume that Harriet’s actions reflect accurate knowledge of her own preferences: some may be thoroughly grounded in experience, while others may be based primarily on supposition, prejudice, fear of the unknown, or weakly supported generalizations.42 A suitably tactful Robbie could be very helpful to Harriet in alerting her to such situations.

Experience and memory

Some psychologists have called into question the very notion that there is one self whose preferences are sovereign in the way that Harsanyi’s principle of preference autonomy suggests. Most prominent among these psychologists is my former Berkeley colleague Daniel Kahneman. Kahneman, who won the 2002 Nobel Prize for his work in behavioral economics, is one of the most influential thinkers on the topic of human preferences. His recent book, Thinking, Fast and Slow,43 recounts in some detail a series of experiments that convinced him that there are two selves—the experiencing self and the remembering self—whose preferences are in conflict.

The experiencing self is the one being measured by the hedonimeter, which the nineteenth-century British economist Francis Edgeworth imagined to be “an ideally perfect instrument, a psychophysical machine, continually registering the height of pleasure experienced by an individual, exactly according to the verdict of consciousness.”44 According to hedonic utilitarianism, the overall value of any experience to an individual is simply the sum of the hedonic values of each instant during the experience. This notion applies equally well to eating an ice cream or living an entire life.

The remembering self, on the other hand, is the one who is “in charge” when there is any decision to be made. This self chooses new experiences based on memories of previous experiences and their desirability. Kahneman’s experiments suggest that the remembering self has very different ideas from the experiencing self.

The simplest experiment to understand involves plunging a subject’s hand into cold water. There are two different regimes: in the first, the immersion is for 60 seconds in water at 14 degrees Celsius; in the second, the immersion is for 60 seconds in water at 14 degrees followed by 30 seconds at 15 degrees. (These temperatures are similar to ocean temperatures in Northern California—cold enough that almost everyone wears a wetsuit in the water.) All subjects report the experience as unpleasant. After experiencing both regimes (in either order, with a 7-minute gap in between), the subject is asked to choose which one they would like to repeat. The great majority of subjects prefer to repeat the 60 + 30 rather than just the 60-second immersion.

Kahneman posits that, from the point of view of the experiencing self, 60 + 30 has to be strictly worse than 60, because it includes 60 and another unpleasant experience. Yet the remembering self chooses 60 + 30. Why?

Kahneman’s explanation is that the remembering self looks back with rather weirdly tinted spectacles, paying attention mainly to the “peak” value (the highest or lowest hedonic value) and the “end” value (the hedonic value at the end of the experience). The durations of different parts of the experience are mostly neglected. The peak discomfort levels for 60 and 60 + 30 are the same, but the end levels are different: in the 60 + 30 case, the water is one degree warmer. If the remembering self evaluates experiences by the peak and end values, rather than by summing up hedonic values over time, then 60 + 30 is better, and this is what is found. The peak-end model seems to explain many other equally weird findings in the literature on preferences.
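
A few lines make the two evaluations concrete; the per-second discomfort values are invented, since the experiment reports only temperatures.

    trial_60    = [10] * 60              # 60 seconds at 14 degrees
    trial_60_30 = [10] * 60 + [8] * 30   # plus 30 seconds at 15 degrees

    def experienced(trial):              # experiencing self: sum over time
        return sum(trial)

    def remembered(trial):               # remembering self: peak and end only
        return (max(trial) + trial[-1]) / 2

    print(experienced(trial_60), experienced(trial_60_30))   # 600 vs 840
    print(remembered(trial_60), remembered(trial_60_30))     # 10.0 vs 9.0
    # Total discomfort is strictly worse for 60 + 30, but its peak-end value
    # is milder, so the remembering self prefers to repeat it.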

Kahneman seems (perhaps appropriately) to be of two minds about his findings. He asserts that the remembering self “simply made a mistake” and chose the wrong experience because its memory is faulty and incomplete; he regards this as “bad news for believers in the rationality of choice.” On the other hand, he writes, “A theory of well-being that ignores what people want cannot be sustained.” Suppose, for example, that Harriet has tried Pepsi and Coke and now strongly prefers Pepsi; it would be absurd to force her to drink Coke based on adding up secret hedonimeter readings taken during each trial.

The fact is that no law requires our preferences between experiences to be defined by the sum of hedonic values over instants of time. It is true that standard mathematical models focus on maximizing a sum of rewards,45 but the original motivation for this was mathematical convenience. Justifications came later in the form of technical assumptions under which it is rational to decide based on adding up rewards,46 but those technical assumptions need not hold in reality. Suppose, for example, that Harriet is choosing between two sequences of hedonic values: [10,10,10,10,10] and [0,0,40,0,0]. It’s entirely possible that she just prefers the second sequence; no mathematical law can force her to make choices based on the sum rather than, say, the maximum.

Kahneman acknowledges that the situation is complicated still further by the crucial role of anticipation and memory in well-being. The memory of a single, delightful experience—one’s wedding day, the birth of a child, an afternoon spent picking blackberries and making jam—can carry one through years of drudgery and disappointment. Perhaps the remembering self is evaluating not just the experience per se but its total effect on life’s future value through its effect on future memories. And presumably it’s the remembering self and not the experiencing self that is the best judge of what will be remembered.

Time and change

It goes almost without saying that sensible people in the twenty-first century would not want to emulate the preferences of, say, Roman society in the second century, replete with gladiatorial slaughter for public entertainment, an economy based on slavery, and brutal massacres of defeated peoples. (We need not dwell on the obvious parallels to these characteristics in modern society.) Standards of morality clearly evolve over time as our civilization progresses—or drifts, if you prefer. This suggests, in turn, that future generations might find utterly repulsive our current attitudes to, say, the well-being of animals. For this reason, it is important that machines charged with implementing human preferences be able to respond to changes in those preferences over time rather than fixing them in stone. The three principles from Chapter 7 accommodate such changes in a natural way, because they require machines to learn and implement the current preferences of current humans—lots of them, all different—rather than a single idealized set of preferences or the preferences of machine designers who may be long dead.47

The possibility of changes in the typical preferences of human populations over historical time naturally focuses attention on the question of how each individual’s preferences are formed and the plasticity of adult preferences. Our preferences are certainly influenced by our biology: we usually avoid pain, hunger, and thirst, for example. Our biology has remained fairly constant, however, so the remaining preferences must arise from cultural and family influences. Quite possibly, children are constantly running some form of inverse reinforcement learning to identify the preferences of parents and peers in order to explain their behavior; children then adopt these preferences as their own. Even as adults, our preferences evolve through the influence of the media, government, friends, employers, and our own direct experiences. It may be the case, for example, that many supporters of the Third Reich did not start out as genocidal sadists thirsting for racial purity.

Preference change presents a challenge for theories of rationality at both the individual and societal level. For example, Harsanyi’s principle of preference autonomy seems to say that everyone is entitled to whatever preferences they have and no one else should touch them. Far from being untouchable, however, preferences are touched and modified all the time, by every experience a person has. Machines cannot help but modify human preferences, because machines modify human experiences.

It’s important, although sometimes difficult, to separate preference change from preference update, which occurs when an initially uncertain Harriet learns more about her own preferences through experience. Preference update can fill in gaps in self-knowledge and perhaps add definiteness to preferences that were previously weakly held and provisional. Preference change, on the other hand, is not a process that results from additional evidence about what one’s preferences actually are. In the extreme case, you can imagine it as resulting from drug administration or even brain surgery—it occurs from processes we may not understand or agree with.

Preference change is problematic for at least two reasons. The first reason is that it’s not clear which preferences should hold sway when making a decision: the preferences that Harriet has at the time of the decision or the preferences that she will have during and after the events that result from her decision. In bioethics, for example, this is a very real dilemma because people’s preferences about medical interventions and end-of-life care do change, often dramatically, after they become seriously ill.48 Assuming these changes do not result from diminished intellectual capacity, whose preferences should be respected?49

The second reason that preference change is problematic is that there seems to be no obvious rational basis for changing (as opposed to updating) one’s preferences. If Harriet prefers A to B, but could choose to undergo an experience that she knows will result in her preferring B to A, why would she ever do that? The outcome would be that she would then choose B, which she currently does not want.

The issue of preference change appears in dramatic form in the legend of Ulysses and the Sirens. The Sirens were mythical beings whose singing lured sailors to their doom on the rocks of certain islands in the Mediterranean. Ulysses, wishing to hear the song, ordered his sailors to plug their ears with wax and to bind him to the mast; under no circumstances were they to obey his subsequent entreaties to release him. Obviously, he wanted the sailors to respect the preferences he had initially, not the preferences he would have after the Sirens bewitched him. This legend became the title of a book by the Norwegian philosopher Jon Elster,50 dealing with weakness of will and other challenges to the theoretical idea of rationality.

Why might an intelligent machine deliberately set out to modify the preferences of humans? The answer is quite simple: to make the preferences easier to satisfy. We saw this in Chapter 1 with the case of social-media click-through optimization. One response might be to say that machines must treat human preferences as sacrosanct: nothing can be allowed to change the human’s preferences. Unfortunately, this is completely impossible. The very existence of a useful robot aide is likely to have an effect on human preferences.

One possible solution is for machines to learn about human meta-preferences—that is, preferences about what kinds of preference change processes might be acceptable or unacceptable. Notice the use of “preference change processes” rather than “preference changes” here. That’s because wanting one’s preferences to change in a specific direction often amounts to having that preference already; what’s really wanted in such a case is the ability to be better at implementing the preference. For example, if Harriet says, “I want my preferences to change so that I don’t want cake as much as I do now,” then she already has a preference for a future with less cake consumption; what she really wants is to alter her cognitive architecture so that her behavior more closely reflects that preference.

By “preferences about what kinds of preference change processes might be acceptable or unacceptable,” I mean, for example, a view that one may end up with “better” preferences by traveling the world and experiencing a wide variety of cultures, or by participating in a vibrant intellectual community that thoroughly explores a wide range of moral traditions, or by setting aside some hermit time for introspection and hard thinking about life and its meaning. I’ll call these processes preference-neutral, in the sense that one does not anticipate that the process will change one’s preferences in any particular direction, while recognizing that some may strongly disagree with that characterization.

Of course, not all preference-neutral processes are desirable—for example, few people expect to develop “better” preferences by whacking themselves on the head. Subjecting oneself to an acceptable process of preference change is analogous to running an experiment to find out something about how the world works: you never know in advance how the experiment will turn out, but you expect, nonetheless, to be better off in your new mental state.

The idea that there are acceptable routes to preference modification seems related to the idea that there are acceptable methods of behavior modification whereby, for example, an employer engineers the choice situation so that people make “better” choices about saving for retirement. Often this can be done by manipulating the “non-rational” factors that influence choice, rather than by restricting choices or taxing “bad” choices. Nudge, a book by economist Richard Thaler and legal scholar Cass Sunstein, lays out a wide range of supposedly acceptable methods and opportunities to “influence people’s behavior in order to make their lives longer, healthier, and better.”

It’s unclear whether behavior modification methods are really just modifying behavior. If, when the nudge is removed, the modified behavior persists—which is presumably the desired outcome of such interventions—then something has changed in the individual’s cognitive architecture (the thing that turns underlying preferences into behavior) or in the individual’s underlying preferences. It’s quite likely to be a bit of both. What is clear, however, is that the nudge strategy is assuming that everyone shares a preference for “longer, healthier, and better” lives; each nudge is based on a particular definition of a “better” life, which seems to go against the grain of preference autonomy. It might be better, instead, to design preference-neutral assistive processes that help people bring their decisions and their cognitive architectures into better alignment with their underlying preferences. For example, it’s possible to design cognitive aides that highlight the longer-term consequences of decisions and teach people to recognize the seeds of those consequences in the present.51

That we need a better understanding of the processes whereby human preferences are formed and shaped seems obvious, not least because such an understanding would help us design machines that avoid accidental and undesirable changes in human preferences of the kind wrought by social-media content selection algorithms. Armed with such an understanding, of course, we will be tempted to engineer changes that would result in a “better” world.

Some might argue that we should provide much greater opportunities for preference-neutral “improving” experiences such as travel, debate, and training in analytical and critical thinking. We might, for example, provide opportunities for every high-school student to live for a few months in at least two other cultures distinct from his or her own.

Almost certainly, however, we will want to go further—for example, by instituting social and educational reforms that increase the coefficient of altruism—the weight that each individual places on the welfare of others—while decreasing the coefficients of sadism, pride, and envy. Would this be a good idea? Should we recruit our machines to help in the process? It’s certainly tempting. Indeed, Aristotle himself wrote, “The main concern of politics is to engender a certain character in the citizens and to make them good and disposed to perform noble actions.” Let’s just say that there are risks associated with intentional preference engineering on a global scale. We should proceed with extreme caution.

10

PROBLEM SOLVED?

If we succeed in creating provably beneficial AI systems, we would eliminate the risk that we might lose control over superintelligent machines. Humanity could proceed with their development and reap the almost unimaginable benefits that would flow from the ability to wield far greater intelligence in advancing our civilization. We would be released from millennia of servitude as agricultural, industrial, and clerical robots and we would be free to make the best of life’s potential. From the vantage point of this golden age, we would look back on our lives in the present time much as Thomas Hobbes imagined life without government: solitary, poor, nasty, brutish, and short.

Or perhaps not. Bondian villains may circumvent our safeguards and unleash uncontrollable superintelligences against which humanity has no defense. And if we survive that, we may find ourselves gradually enfeebled as we entrust more and more of our knowledge and skills to machines. The machines may advise us not to do this, understanding the long-term value of human autonomy, but we may overrule them.

Beneficial Machines

The standard model underlying a good deal of twentieth-century technology relies on machinery that optimizes a fixed, exogenously supplied objective. As we have seen, this model is fundamentally flawed. It works only if the objective is guaranteed to be complete and correct, or if the machinery can easily be reset. Neither condition will hold as AI becomes increasingly powerful.

If the exogenously supplied objective can be wrong, then it makes no sense for the machine to act as if it is always correct. Hence my proposal for beneficial machines: machines whose actions can be expected to achieve our objectives. Because these objectives are in us, and not in them, the machines will need to learn more about what we really want from observations of the choices we make and how we make them. Machines designed in this way will defer to humans: they will ask permission; they will act cautiously when guidance is unclear; and they will allow themselves to be switched off.

While these initial results are for a simplified and idealized setting, I believe they will survive the transition to more realistic settings. Already, my colleagues have successfully applied the same approach to practical problems such as self-driving cars interacting with human drivers.1 For example, self-driving cars are notoriously bad at handling four-way stop signs when it’s not clear who has the right of way. By formulating this as an assistance game, however, the car comes up with a novel solution: it actually backs up a little bit to show that it’s definitely not planning to go first. The human understands this signal and goes ahead, confident that there will be no collision. Obviously, we human experts could have thought of this solution and programmed it into the vehicle, but that’s not what happened; this is a form of communication that the vehicle invented entirely by itself.

As we gain more experience in other settings, I expect that we will be surprised by the range and fluency of machine behaviors as they interact with humans. We are so used to the stupidity of machines that execute inflexible, preprogrammed behaviors or pursue definite but incorrect objectives that we may be shocked by how sensible they become. The technology of provably beneficial machines is the core of a new approach to AI and the basis for a new relationship between humans and machines.

It seems possible, also, to apply similar ideas to the redesign of other “machines” that ought to be serving humans, beginning with ordinary software systems. We are taught to build software by composing subroutines, each of which has a well-defined specification that says what the output should be for any given input—just like the square-root button on a calculator. This specification is the direct analog of the objective given to an AI system. The subroutine is not supposed to terminate and return control to the higher layers of the software system until it has produced an output that meets the specification. (This should remind you of the AI system that persists in its single-minded pursuit of its given objective.) A better approach would be to allow for uncertainty in the specification. For example, a subroutine that carries out some fearsomely complicated mathematical computation is typically given an error bound that defines the required precision for the answer and has to return a solution that is correct within that error bound. Sometimes, this may require weeks of computation. Instead, it might be better to be less precise about the allowed error, so that the subroutine could come back after twenty seconds and say, “I’ve found a solution that’s this good. Is that OK or do you want me to continue?” In some cases, the question may percolate all the way to the top level of the software system, so that the human user can provide further guidance to the system. The human’s answers would then help in refining the specifications at all levels.
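
As a sketch of what such a deferential subroutine could look like, here is a square-root routine that checks in periodically instead of grinding on until a fixed error bound is met; the interface and timings are invented for illustration.

    import time

    def deferential_sqrt(x, steps_per_check=20, ask=input):
        # Bisection for sqrt(x); each batch of steps ends with a check-in.
        lo, hi = 0.0, max(1.0, x)
        while True:
            for _ in range(steps_per_check):
                mid = (lo + hi) / 2
                if mid * mid < x:
                    lo = mid
                else:
                    hi = mid
                time.sleep(0.05)         # stand-in for costly computation
            answer, error = (lo + hi) / 2, hi - lo
            reply = ask(f"Best so far: {answer:.9f} (+/- {error:.1e}). "
                        "OK, or keep going? ")
            if reply.strip().lower().startswith("ok"):
                return answer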

The same kind of thinking can be applied to entities such as governments and corporations. The obvious failings of government include paying too much attention to the preferences (financial as well as political) of those in government and too little attention to the preferences of the governed. Elections are supposed to communicate preferences to the government, but they seem to have a remarkably small bandwidth (on the order of one byte of information every few years) for such a complex task. In far too many countries, government is simply a means for one group of people to impose its will on others. Corporations go to greater lengths to learn the preferences of customers, whether through market research or direct feedback in the form of purchase decisions. On the other hand, the molding of human preferences through advertising, cultural influences, and even chemical addiction is an accepted way of doing business.

Governance of AI

AI has the power to reshape the world, and the process of reshaping will have to be managed and guided in some way. If the sheer number of initiatives to develop effective governance of AI is any guide, then we are in excellent shape. Everyone and their uncle is setting up a Board or a Council or an International Panel. The World Economic Forum has identified nearly three hundred separate efforts to develop ethical principles for AI. My email inbox can be summarized as one long invitation to the Global World Summit Conference Forum on the Future of International Governance of the Social and Ethical Impacts of Emerging Artificial Intelligence Technologies.

This is all very different from what happened with nuclear technology. After World War II, the United States held all the nuclear cards. In 1953, US president Dwight Eisenhower proposed to the UN an international body to regulate nuclear technology. In 1957, the International Atomic Energy Agency started work; it is the sole global overseer for the safe and beneficial development of nuclear energy.

In contrast, many hands hold AI cards. To be sure, the United States, China, and the EU fund a lot of AI research, but almost all of it occurs outside secure national laboratories. AI researchers in universities are part of a broad, cooperative international community, glued together by shared interests, conferences, cooperative agreements, and professional societies such as AAAI (the Association for the Advancement of Artificial Intelligence) and IEEE (the Institute of Electrical and Electronics Engineers, which includes tens of thousands of AI researchers and practitioners). Probably the majority of investment in AI research and development is now occurring within corporations, large and small; the leading players as of 2019 are Google (including DeepMind), Facebook, Amazon, Microsoft, and IBM in the United States and Tencent, Baidu, and, to some extent, Alibaba in China—all among the largest corporations in the world.2 All but Tencent and Alibaba are members of the Partnership on AI, an industry consortium that includes among its tenets a promise of cooperation on AI safety. Finally, although the vast majority of humans possess little in the way of AI expertise, there is at least a superficial willingness among other players to take the interests of humanity into account.

因此,这些是掌握大多数牌的玩家。他们的利益并不完全一致,但都希望在人工智能系统变得更加强大时保持对它们的控制。(其他目标,例如避免大规模失业,是政府和大学研究人员共同的目标,但希望在短期内从尽可能广泛的人工智能部署中获利的公司未必认同。)为了巩固这种共同利益并实现协调行动,有一些组织具有召集权,这大致意味着,如果组织安排了会议,人们就会接受邀请参加。除了可以将人工智能研究人员聚集在一起的专业协会和将公司和非营利机构结合起来的人工智能伙伴关系之外,典型的召集人是联合国(代表政府和研究人员)和世界经济论坛(代表政府和公司)。此外,七国集团还提议成立一个国际人工智能专门小组,希望它能发展成为类似联合国政府间气候变化专门委员会的组织。听起来很重要的报告如雨后春笋般涌现。

These, then, are the players who hold the majority of the cards. Their interests are not in perfect alignment but all share a desire to maintain control over AI systems as they become more powerful. (Other goals, such as avoiding mass unemployment, are shared by governments and university researchers, but not necessarily by corporations that expect to profit in the short term from the widest possible deployment of AI.) To cement this shared interest and achieve coordinated action, there are organizations with convening power, which means, roughly, that if the organization sets up a meeting, people accept the invitation to participate. In addition to the professional societies, which can bring AI researchers together, and the Partnership on AI, which combines corporations and nonprofit institutes, the canonical conveners are the UN (for governments and researchers) and the World Economic Forum (for governments and corporations). In addition, the G7 has proposed an International Panel on Artificial Intelligence, hoping that it will grow into something like the UN’s Intergovernmental Panel on Climate Change. Important-sounding reports are multiplying like rabbits.

有了这些活动,治理是否有可能取得实际进展?答案或许令人惊讶,至少在边缘上是肯定的。世界上许多政府都在建立咨询机构,以协助制定法规;也许最突出的例子是欧盟人工智能高级专家组。针对用户隐私、数据交换和避免种族偏见等问题的协议、规则和标准开始出现。政府和企业正在努力制定自动驾驶汽车的规则——这些规则不可避免地会涉及跨境元素。人们一致认为,如果要信任人工智能系统,人工智能决策必须是可解释的,而这一共识已在欧盟的 GDPR 立法中得到部分实施。在加利福尼亚州,一项新法律禁止人工智能系统在某些情况下冒充人类。最后两项——可解释性和冒充性——肯定与人工智能安全和控制问题有关。

With all this activity, is there any prospect of actual progress on governance occurring? Perhaps surprisingly, the answer is yes, at least around the edges. Many governments around the world are equipping themselves with advisory bodies to help with the process of developing regulations; perhaps the most prominent example is the EU’s High-Level Expert Group on Artificial Intelligence. Agreements, rules, and standards are beginning to emerge for issues such as user privacy, data exchange, and avoiding racial bias. Governments and corporations are working hard to sort out the rules for self-driving cars—rules that will inevitably have cross-border elements. There is a consensus that AI decisions must be explainable if AI systems are to be trusted, and that consensus is already partially implemented in the EU’s GDPR legislation. In California, a new law forbids AI systems to impersonate humans in certain circumstances. These last two items—explainability and impersonation—certainly have some bearing on issues of AI safety and control.

目前,对于考虑保持对人工智能系统控制权问题的政府或其他组织,尚无可实施的建议。诸如“人工智能系统必须安全可控”之类的规定毫无意义,因为这些术语尚未具有确切含义,而且也没有广为人知的确保安全性和可控性的工程方法。但让我们乐观地想象,几年后,人工智能“可证明有益”方法的有效性已通过数学分析和以有用应用形式的实际实现得到证实。例如,我们可能会拥有可以信赖的个人数字助理来使用我们的信用卡、筛选我们的电话和电子邮件以及管理我们的财务,因为它们已经适应了我们的个人偏好,知道何时可以继续,何时最好寻求指导。我们的自动驾驶汽车可能已经学会了彼此之间以及与人类司机互动的良好礼仪,而我们的家用机器人应该能够与最难缠的幼儿顺利互动。幸运的话,不会有猫被烤来当晚餐,也不会有鲸鱼肉被端给绿党成员。

At present, there are no implementable recommendations that can be made to governments or other organizations considering the issue of maintaining control over AI systems. A regulation such as “AI systems must be safe and controllable” would carry no weight, because these terms do not yet have precise meanings and because there is no widely known engineering methodology for ensuring safety and controllability. But let’s be optimistic and imagine that, a few years down the line, the validity of the “provably beneficial” approach to AI has been established through both mathematical analysis and practical realization in the form of useful applications. We might, for example, have personal digital assistants that we can trust to use our credit cards, screen our calls and emails, and manage our finances because they have adapted to our individual preferences and know when it’s OK to go ahead and when it’s better to ask for guidance. Our self-driving cars may have learned good manners for interacting with one another and with human drivers, and our domestic robots should be interacting smoothly with even the most recalcitrant toddler. With luck, no cats will have been roasted for dinner and no whale meat will have been served to members of the Green Party.

此时,指定软件设计模板可能是可行的,各种应用程序必须符合这些模板才能出售或连接到互联网,就像应用程序必须通过一系列软件测试才能在 Apple 的 App Store 或 Google Play 上出售一样。软件供应商可以提出其他模板,只要它们能证明模板满足(当时已明确定义的)安全性和可控性要求即可。将有机制来报告问题并更新产生不良行为的软件系统。围绕可证明安全的人工智能程序这一理念制定专业行为准则并将相应的定理和方法整合到有抱负的人工智能和机器学习从业者的课程中也是有意义的。

At that point, it might be feasible to specify software design templates to which various kinds of applications must conform in order to be sold or connected to the Internet, just as applications have to pass a number of software tests before they can be sold on Apple’s App Store or Google Play. Software vendors could propose additional templates, as long as they come with proofs that the templates satisfy the (by then well-defined) requirements of safety and controllability. There would be mechanisms for reporting problems and for updating software systems that produce undesirable behavior. It would make sense also to create professional codes of conduct around the idea of provably safe AI programs and to integrate the corresponding theorems and methods into the curriculum for aspiring AI and machine learning practitioners.

对于经验丰富的硅谷观察家来说,这可能听起来相当幼稚。硅谷强烈反对任何形式的监管。虽然我们习惯于这样的想法,即制药公司必须通过临床试验证明其安全性和(有益的)功效,然后才能向公众发布产品,但软件行业遵循的是另一套规则——即空集。软件公司的“一群喝着红牛” 3 的人可以发布一款产品或升级,影响数十亿人,而无需任何第三方监督。

To a seasoned observer of Silicon Valley, this may sound rather naïve. Regulation of any kind is strenuously opposed in the Valley. Whereas we are accustomed to the idea that pharmaceutical companies have to show safety and (beneficial) efficacy through clinical trials before they can release a product to the general public, the software industry operates by a different set of rules—namely, the empty set. A “bunch of dudes chugging Red Bull”3 at a software company can unleash a product or an upgrade that affects literally billions of people with no third-party oversight whatsoever.

然而,科技行业不可避免地必须承认其产品很重要;如果它们很重要,那么产品不产生有害影响也很重要。这意味着将有规则来管理与人类互动的性质,禁止那些持续操纵偏好或产生上瘾行为的设计。我毫不怀疑,从不受监管的世界走向受监管的世界将是一个痛苦的过程。但愿我们不需要像切尔诺贝利那样的灾难(或更糟的灾难)来克服这个行业的阻力。

Inevitably, however, the tech industry is going to have to acknowledge that its products matter; and, if they matter, then it matters that the products not have harmful effects. This means that there will be rules governing the nature of interactions with humans, prohibiting designs that, say, consistently manipulate preferences or produce addictive behavior. I have no doubt that the transition from an unregulated to a regulated world will be a painful one. Let’s hope it doesn’t require a Chernobyl-sized disaster (or worse) to overcome the industry’s resistance.

滥用

Misuse

监管对软件行业来说可能是痛苦的,但对在秘密地下堡垒中策划统治世界的邪恶博士来说,则是无法容忍的。毫无疑问,犯罪分子、恐怖分子和流氓国家都有动机绕过对智能机器设计的任何限制,以便利用它们来控制武器或设计和开展犯罪活动。危险并不在于邪恶计划会得逞;而是他们会因失去对设计不良的智能系统的控制而失败——尤其是那些被灌输邪恶目标并被授予武器使用权的系统。

Regulation might be painful for the software industry, but it would be intolerable for Dr. Evil, plotting world domination in his secret underground bunker. There is no doubt that criminal elements, terrorists, and rogue nations would have an incentive to circumvent any constraints on the design of intelligent machines so that they could be used to control weapons or to devise and carry out criminal activities. The danger is not so much that the evil schemes would succeed; it is that they would fail by losing control over poorly designed intelligent systems—particularly ones imbued with evil objectives and granted access to weapons.

这并不是逃避监管的理由——毕竟,我们有反谋杀的法律,尽管这些法律经常被规避。然而,这确实造成了一个非常严重的治安问题。我们已经在与恶意软件和网络犯罪的斗争中落败了。(最近的一份报告估计有超过 20 亿受害者,每年损失约 6000 亿美元。4)以高智能程序形式出现的恶意软件将更难被击败。

This is not a reason to avoid regulation—after all, we have laws against murder even though they are often circumvented. It does, however, create a very serious policing problem. Already, we are losing the battle against malware and cybercrime. (A recent report estimates over two billion victims and an annual cost of around $600 billion.4) Malware in the form of highly intelligent programs would be much harder to defeat.

包括尼克·博斯特罗姆在内的一些人建议,我们应该利用我们自己的、有益的超级智能 AI 系统来检测和摧毁任何恶意或行为不当的 AI 系统。当然,我们应该利用我们掌握的工具,同时尽量减少对个人自由的影响,但人类挤在掩体里,无力抵御超级智能所释放的巨大力量,这种形象很难让人放心,即使其中一些超级智能站在我们这边。找到将恶意 AI 扼杀在萌芽状态的方法要好得多。

Some, including Nick Bostrom, have proposed that we use our own, beneficial superintelligent AI systems to detect and destroy any malicious or otherwise misbehaving AI systems. Certainly, we should use the tools at our disposal, while minimizing the impact on personal freedom, but the image of humans huddling in bunkers, defenseless against the titanic forces unleashed by battling superintelligences, is hardly reassuring even if some of them are on our side. It would be far better to find ways to nip the malicious AI in the bud.

一个好的开端将是成功、协调、国际打击网络犯罪的运动,包括扩大《布达佩斯网络犯罪公约》。这将为未来可能防止不受控制的人工智能程序出现的努力形成一个组织模板。同时,它将产生一种广泛的文化理解,即创建此类程序,无论是有意还是无意,从长远来看,都是一种与制造流行病生物相当的自杀行为。

A good first step would be a successful, coordinated, international campaign against cybercrime, including expansion of the Budapest Convention on Cybercrime. This would form an organizational template for possible future efforts to prevent the emergence of uncontrolled AI programs. At the same time, it would engender a broad cultural understanding that creating such programs, either deliberately or inadvertently, is in the long run a suicidal act comparable to creating pandemic organisms.

衰弱与人类自主

Enfeeblement and Human Autonomy

E. M. 福斯特最著名的小说,包括《霍华德庄园》和《印度之行》,都探讨了 20 世纪早期的英国社会及其阶级制度。1909 年,他写了一篇著名的科幻小说《机器停止了》。这个故事因其预见性而引人注目,包括对(我们现在所说的)互联网、视频会议、iPad、大规模开放式在线课程 (MOOC)、普遍肥胖症以及避免面对面接触的描述。标题中的机器是一个满足人类所有需求的包罗万象的智能基础设施。人类越来越依赖它,但他们对它的工作原理的了解却越来越少。工程知识让位于仪式化的咒语,最终无法阻止机器运作的逐渐恶化。主角库诺看到了正在发生的事情,但却无力阻止它:

E. M. Forster’s most famous novels, including Howards End and A Passage to India, examined British society and its class system in the early part of the twentieth century. In 1909, he wrote one notable science-fiction story: “The Machine Stops.” The story is remarkable for its prescience, including depictions of (what we would now call) the Internet, videoconferencing, iPads, massive open online courses (MOOCs), widespread obesity, and avoidance of face-to-face contact. The Machine of the title is an all-encompassing intelligent infrastructure that meets all human needs. Humans become increasingly dependent on it, but they understand less and less about how it works. Engineering knowledge gives way to ritualized incantations that eventually fail to stem the gradual deterioration of the Machine’s workings. Kuno, the main character, sees what is unfolding but is powerless to stop it:

难道你看不出来……我们才是正在死去的,而这里唯一真正活着的东西就是机器?我们创造了机器来执行我们的意志,但现在我们不能让它执行我们的意志。它剥夺了我们的空间感和触觉,它模糊了每一种人际关系,它使我们的身体和意志瘫痪……我们只是作为在它的动脉中流动的血细胞而存在,如果它没有我们也能工作,它会让我们死去。哦,我没有办法——或者至少只有一个办法——就是一遍又一遍地告诉人们,我看到了威塞克斯的山丘,就像埃尔弗里德推翻丹麦人时看到的那样。

Cannot you see . . . that it is we that are dying, and that down here the only thing that really lives is the Machine? We created the Machine to do our will, but we cannot make it do our will now. It has robbed us of the sense of space and of the sense of touch, it has blurred every human relation, it has paralysed our bodies and our wills. . . . We only exist as the blood corpuscles that course through its arteries, and if it could work without us, it would let us die. Oh, I have no remedy—or, at least, only one—to tell men again and again that I have seen the hills of Wessex as Aelfrid saw them when he overthrew the Danes.

地球上曾生活过一千多亿人。他们(我们)花费了大约一万亿人年的时间来学习和教学,以便我们的文明得以延续。到目前为止,它延续的唯一可能性是通过在新一代人的头脑中重新创造。(纸张是一种很好的传输方式,但纸张上记录的知识只有在到达下一个人的头脑中时才会发挥作用。)这种情况正在改变:我们越来越有可能将知识放入机器中,这些机器本身可以为我们运行我们的文明。

More than one hundred billion people have lived on Earth. They (we) have spent on the order of one trillion person-years learning and teaching, in order that our civilization may continue. Up to now, its only possibility for continuation has been through re-creation in the minds of new generations. (Paper is fine as a method of transmission, but paper does nothing until the knowledge recorded thereon reaches the next person’s mind.) That is now changing: increasingly, it is possible to place our knowledge into machines that, by themselves, can run our civilization for us.

一旦将我们的文明传给下一代的现实动机消失,这个过程将很难逆转。一万亿年的累积学习将真正付诸东流。我们将成为一艘由机器驾驶的游轮上的乘客,踏上永无止境的航程——正如电影《机器人总动员》中所设想的那样。

Once the practical incentive to pass our civilization on to the next generation disappears, it will be very hard to reverse the process. One trillion years of cumulative learning would, in a real sense, be lost. We would become passengers in a cruise ship run by machines, on a cruise that goes on forever—exactly as envisaged in the film WALL-E.

一个好的结果论者会说:“显然,这是过度使用自动化的不良后果!设计合理的机器绝不会这样做!”没错,但想想这意味着什么。机器可能完全理解,人类的自主性和能力是我们喜欢的生活方式的重要方面。它们可能会坚持要求人类保留对自己福祉的控制权和责任——换句话说,机器会说不。但我们这些目光短浅、懒惰的人可能会不同意。这里存在着一场公地悲剧:对于任何个人来说,花费数年时间艰苦学习以获得机器已经拥有的知识和技能似乎毫无意义;但如果每个人都这样想,人类将集体失去自主权。

A good consequentialist would say, “Obviously this is an undesirable consequence of the overuse of automation! Suitably designed machines would never do this!” True, but think what this means. Machines may well understand that human autonomy and competence are important aspects of how we prefer to conduct our lives. They may well insist that humans retain control and responsibility for their own well-being—in other words, machines will say no. But we myopic, lazy humans may disagree. There is a tragedy of the commons at work here: for any individual human, it may seem pointless to engage in years of arduous learning to acquire knowledge and skills that machines already have; but if everyone thinks that way, the human race will, collectively, lose its autonomy.

这个问题的解决方案似乎是文化上的,而不是技术的。我们需要一场文化运动来重塑我们的理想和偏好,转向自主、能动性和能力,远离自我放纵和依赖——如果你愿意的话,可以说是古代斯巴达军事精神的现代文化版本。这意味着在全球范围内对人类偏好进行工程改造,同时彻底改变我们的社会运作方式。为了避免让糟糕的情况变得更糟,我们可能需要超级智能机器的帮助,无论是在制定解决方案方面,还是在为每个人实现平衡的实际过程中。

The solution to this problem seems to be cultural, not technical. We will need a cultural movement to reshape our ideals and preferences towards autonomy, agency, and ability and away from self-indulgence and dependency—if you like, a modern, cultural version of ancient Sparta’s military ethos. This would mean human preference engineering on a global scale along with radical changes in how our society works. To avoid making a bad situation worse, we might need the help of superintelligent machines, both in shaping the solution and in the actual process of achieving a balance for each individual.

任何有小孩的父母都熟悉这个过程。一旦孩子过了无助阶段,养育孩子就需要在为孩子做一切事情和让孩子完全自主之间不断取得平衡。在某个阶段,孩子开始明白父母完全有能力帮孩子系鞋带,但他们选择不这么做。这就是人类的未来吗——永远被更高级的机器当做孩子对待?我想不会。首先,孩子无法把父母关掉。(谢天谢地!)我们也不会成为宠物或动物园动物。我们目前的世界确实没有与未来我们与有益的智能机器之间的关系类似的关系。最终结果如何还有待观察。

Any parent of a small child is familiar with this process. Once the child is beyond the helpless stage, parenting requires an ever-evolving balance between doing everything for the child and leaving the child entirely to his or her own devices. At a certain stage, the child comes to understand that the parent is perfectly capable of tying the child’s shoelaces but is choosing not to. Is that the future for the human race—to be treated like a child, forever, by far superior machines? I suspect not. For one thing, children cannot switch their parents off. (Thank goodness!) Nor will we be pets or zoo animals. There is really no analog in our present world to the relationship we will have with beneficial intelligent machines in the future. It remains to be seen how the endgame turns out.

附录 A

Appendix A

寻找解决方案

SEARCHING FOR SOLUTIONS

通过展望未来并考虑不同可能动作序列的结果来选择动作是智能系统的基本能力。每当您向手机询问方向时,它都会执行此操作。图 14显示了一个典型示例:从当前位置 19 号码头前往目标科伊特塔。算法需要知道它可以使用哪些动作;通常,对于地图导航,每个动作都会穿越连接两个相邻交叉路口的路段。在这个例子中,从 19 号码头出发只有一个动作:右转并沿着 Embarcadero 行驶到下一个交叉路口。然后有一个选择:继续行驶或急转左进入 Battery Street。算法会系统地探索所有这些可能性,直到最终找到一条路线。通常我们会添加一些常识性指导,例如优先探索通往目标而不是远离目标的街道。借助这种指导和其他一些技巧,算法可以非常快速地找到最佳解决方案——通常在几毫秒内,即使是跨国旅行也是如此。

Choosing an action by looking ahead and considering the outcomes of different possible action sequences is a fundamental capability for intelligent systems. It’s something your cell phone does whenever you ask it for directions. Figure 14 shows a typical example: getting from the current location, Pier 19, to the goal, Coit Tower. The algorithm needs to know what actions are available to it; typically, for map navigation, each action traverses a road segment connecting two adjacent intersections. In the example, from Pier 19 there is just one action: turn right and drive along the Embarcadero to the next intersection. Then there is a choice: continue on or take a sharp left onto Battery Street. The algorithm systematically explores all these possibilities until it eventually finds a route. Typically we add a little bit of commonsense guidance, such as a preference for exploring streets that head towards the goal rather than away from it. With this guidance and a few other tricks, the algorithm can find optimal solutions very quickly—usually in a few milliseconds, even for a cross-country trip.
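
To make this concrete, here is a minimal sketch of this kind of guided search, essentially the A* algorithm, in Python. The intersection names, segment lengths, and coordinates below are invented for illustration; the straight-line-distance heuristic plays the role of the commonsense preference for streets that head towards the goal.

    import heapq, math

    # Toy road graph: intersection -> [(neighbor, segment_length)].
    # Names, lengths, and coordinates are invented for illustration.
    graph = {
        "Pier19":       [("EmbarcaderoA", 200)],
        "EmbarcaderoA": [("Battery", 150), ("EmbarcaderoB", 180)],
        "Battery":      [("CoitTower", 400)],
        "EmbarcaderoB": [("CoitTower", 250)],
        "CoitTower":    [],
    }
    coords = {"Pier19": (0, 0), "EmbarcaderoA": (2, 0), "Battery": (2, 1),
              "EmbarcaderoB": (4, 0), "CoitTower": (4, 2)}

    def heuristic(a, b):
        # Straight-line distance: prefers intersections that head towards the goal.
        (x1, y1), (x2, y2) = coords[a], coords[b]
        return math.hypot(x2 - x1, y2 - y1) * 100

    def astar(start, goal):
        frontier = [(heuristic(start, goal), 0, start, [start])]
        best = {}   # cheapest cost found so far for each intersection
        while frontier:
            f, cost, node, path = heapq.heappop(frontier)
            if node == goal:
                return path, cost
            if best.get(node, float("inf")) <= cost:
                continue                      # already reached this node more cheaply
            best[node] = cost
            for nbr, length in graph[node]:
                heapq.heappush(frontier, (cost + length + heuristic(nbr, goal),
                                          cost + length, nbr, path + [nbr]))
        return None

    print(astar("Pier19", "CoitTower"))

Because the straight-line distance never overestimates the true driving distance, the first route this search completes is guaranteed to be a shortest one.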

在地图上搜索路线是一个自然而熟悉的例子,但它可能有点误导,因为不同地点的数量太少了。例如,在美国,只有大约一千万个交叉路口。这个数字看似很大,但与 15 谜题中不同状态的数量相比,就微不足道了。15 谜题是一个玩具,它有一个四乘四的网格,包含十五个编号的方格和一个空位。游戏目标是移动方格,以实现目标配置,比如让所有方格按数字顺序排列。15 谜题大约有十万亿个状态(比美国的交叉路口数量多一百万倍!);24 谜题大约有八亿亿亿(8×10^24)个状态。这就是数学家所说的组合复杂性的例子——随着问题的“活动部件”数量的增加,组合数量也会迅速激增。回到美国地图:如果一家卡车运输公司想要优化其一百辆卡车在全美的运输,那么需要考虑的可能状态的数量将是一千万的一百次方(即 10^700)。

Searching for routes on maps is a natural and familiar example, but it may be a bit misleading because the number of distinct locations is so small. In the United States, for example, there are only about ten million intersections. That may seem like a large number, but it is tiny compared to the number of distinct states in the 15-puzzle. The 15-puzzle is a toy with a four-by-four grid containing fifteen numbered tiles and a blank space. The goal is to move the tiles around to achieve a goal configuration, such as having all the tiles in numerical order. The 15-puzzle has about ten trillion states (a million times bigger than the United States!); the 24-puzzle has about eight trillion trillion states. This is an example of what mathematicians call combinatorial complexity—the rapid explosion in the number of combinations as the number of “moving parts” of a problem increases. Returning to the map of the United States: if a trucking company wants to optimize the movements of its one hundred trucks across the United States, the number of possible states to consider would be ten million to the power of one hundred (i.e., 10^700).
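
These numbers are easy to check with Python's arbitrary-precision integers; the division by two below reflects the standard parity argument that exactly half of all tile arrangements are reachable.

    import math

    intersections = 10_000_000              # roughly ten million US intersections
    states_15 = math.factorial(16) // 2     # reachable 15-puzzle states
    states_24 = math.factorial(25) // 2     # reachable 24-puzzle states
    print(f"{states_15:.2e}")               # ~1.05e13: about ten trillion
    print(f"{states_24:.2e}")               # ~7.76e24: about eight trillion trillion
    print(len(str(intersections ** 100)))   # 701 digits: the trucking problem has ~10^700 states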

图 14:旧金山部分地区的地图,显示了初始位置在 19 号码头,目的地在科伊特塔。

FIGURE 14: A map of part of San Francisco, showing the initial location at Pier 19 and the destination at Coit Tower.

放弃理性决策

Giving up on rational decisions

许多游戏都具有这种组合复杂性,包括国际象棋、跳棋、西洋双陆棋和围棋。由于围棋规则简单而优雅(图 15),我将使用它作为示例。目标很明确:通过比对手包围更多领地来赢得游戏。可能的操作也很明确:将一颗棋子放在空位。就像在地图上导航一样,决定做什么的显而易见的方法是想象由不同动作序列导致的不同未来,然后选择最佳的一个。你会问:“如果我这样做,我的对手会怎么做?那我该怎么做?”图 16以 3×3 围棋为例说明了这一想法。即使对于 3×3 围棋,我也只能展示可能未来树的一小部分,但我希望这个想法足够清楚。事实上,这种决策方式似乎只是直截了当的常识。

Many games have this property of combinatorial complexity, including chess, checkers, backgammon, and Go. Because the rules of Go are simple and elegant (figure 15), I’ll use it as a running example. The objective is clear enough: win the game by surrounding more territory than your opponent. The possible actions are clear too: put a stone in an empty location. Just as with navigation on a map, the obvious way to decide what to do is to imagine different futures that result from different sequences of actions and choose the best one. You ask, “If I do this, what might my opponent do? And what do I do then?” This idea is illustrated in figure 16 for 3×3 Go. Even for 3×3 Go, I can show only a small part of the tree of possible futures, but I hope the idea is clear enough. Indeed, this way of making decisions seems to be just straightforward common sense.

图 15:2002 年 LG 杯决赛李世石(黑棋)与崔明勋(白棋)第五局的围棋棋盘。黑棋和白棋轮流在棋盘上任意空位下棋。现在轮到黑棋下棋,一共有 343 种可能的走法。每一方都试图包围尽可能多的领地。例如,白棋很有可能赢得左侧边缘和下边缘左侧的领地,而黑棋则有可能赢得右上角和右下角的领地。围棋的一个关键概念是群子,一组通过垂直或水平邻接相互连接的同色棋子。只要群子旁边至少有一个空位,它就可以存活;如果它被完全包围,没有空位,它就会死亡并从棋盘上移除。

FIGURE 15: A Go board, partway through Game 5 of the 2002 LG Cup final between Lee Sedol (black) and Choe Myeong-hun (white). Black and White take turns placing a single stone on any unoccupied location on the board. Here, it is Black’s turn to move and there are 343 possible moves. Each side attempts to surround as much territory as possible. For example, White has good chances to win territory at the left-hand edge and on the left side of the bottom edge, while Black may win territory in the top-right and bottom-right corners. A key concept in Go is that of a group—that is, a set of stones of the same color that are connected to one another by vertical or horizontal adjacency. A group remains alive as long as there is at least one empty space next to it; if it is completely surrounded, with no empty spaces, it dies and is removed from the board.

图 16:3×3 围棋博弈树的一部分。从空的初始状态(有时称为树的根)开始,黑方可以从三种可能的不同走法中选择一种。(其他走法与这些走法对称。)然后轮到白方下棋。如果黑方选择在中心下棋,白方有两种不同的走法——角或边——下完之后黑方可以再次下棋。通过想象这些可能的未来走向,黑方可以选择在初始状态下走哪一步。如果黑方无法把每一条可能的行棋路线一直算到终局,那么可以使用评估函数来估计树的叶节点处局面的优劣。这里,评估函数为其中两个叶节点赋值 +5 和 +3。

FIGURE 16: Part of the game tree for 3×3 Go. Starting from the empty initial state, sometimes called the root of the tree, Black can choose one of three possible distinct moves. (The others are symmetric with these.) It would then be White’s turn to move. If Black chooses to play in the center, White has two distinct moves—corner or side—then Black would get to play again. By imagining these possible futures, Black can choose which move to play in the initial state. If Black is unable to follow every possible line of play to the end of the game, then an evaluation function can be used to estimate how good the positions are at the leaves of the tree. Here, the evaluation function assigns +5 and +3 to two of the leaves.

问题是,围棋在 19×19 的整个棋盘上有超过 10^170 种可能的局面。虽然在地图上找到一条保证最短的路线相对容易,但在围棋中找到一条保证获胜的路线却完全不可行。即使算法在接下来的十亿年里思考,它也只能探索整个可能性树的一小部分。这引出了两个问题。首先,程序应该探索树的哪一部分?其次,考虑到它已经探索的部分树,程序应该采取哪一步?

The problem is that Go has more than 10^170 possible positions for the full 19×19 board. Whereas finding a guaranteed shortest route on a map is relatively easy, finding a guaranteed win in Go is utterly infeasible. Even if the algorithm ponders for the next billion years, it can explore only a tiny fraction of the whole tree of possibilities. This leads to two questions. First, which part of the tree should the program explore? And second, which move should the program make, given the partial tree that it has explored?

首先回答第二个问题:几乎所有前瞻程序使用的基本思想,都是先为树的“叶子”——即最远的未来状态——赋予一个估计值,然后“回推”以算出根节点处各个选择的好坏。1例如,查看图 16底部的两个位置,人们可能会猜测左边位置的值为 +5(从黑方的角度来看),右边位置的值为 +3,因为白方在角落的棋子比边上的棋子脆弱得多。如果这些值正确,那么黑方可以预期白方会下在边上,从而得到右边的位置;因此,将黑方在中心的初始走法赋值为 +3 似乎是合理的。除细节略有差异外,1955 年,亚瑟·塞缪尔的跳棋程序用这个方案击败了它的创造者;2 1997 年,深蓝用这个方案击败了当时的世界象棋冠军加里·卡斯帕罗夫;2016 年,AlphaGo 用这个方案击败了前世界围棋冠军李世石。深蓝程序中评估叶节点局面的部分由人类编写,主要基于他们对象棋的了解。而塞缪尔的程序和 AlphaGo 程序则是从数千场或数百万场练习赛中学习的。

To answer the second question first: the basic idea used by almost all lookahead programs is to assign an estimated value to the “leaves” of the tree—those states furthest in the future—and then “work back” to find out how good the choices are at the root.1 For example, looking at the two positions at the bottom of figure 16, one might guess a value of +5 (from Black’s viewpoint) for the position on the left and +3 for the position on the right, because White’s stone in the corner is much more vulnerable than the one on the side. If these values are right, then Black can expect that White will play on the side, leading to the right-hand position; hence, it seems reasonable to assign a value of +3 to Black’s initial move in the center. With slight variations, this is the scheme used by Arthur Samuel’s checker-playing program to beat its creator in 1955,2 by Deep Blue to beat the then world chess champion, Garry Kasparov, in 1997, and by AlphaGo to beat former world Go champion Lee Sedol in 2016. For Deep Blue, humans wrote the piece of the program that evaluates positions at the leaves of the tree, based largely on their knowledge of chess. For Samuel’s program and for AlphaGo, the programs learned it from thousands or millions of practice games.
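
Here is a minimal sketch of this evaluate-the-leaves-and-work-back scheme, usually called minimax. The tree below is hypothetical, hand-built to match the +5/+3 example from figure 16, with invented values for the branches the figure does not show.

    # A tree is either a leaf (its estimated value) or a list of child positions.
    def minimax(node, maximizing):
        if isinstance(node, (int, float)):   # a leaf: return its estimated value
            return node
        values = [minimax(child, not maximizing) for child in node]
        # Black (the maximizer) picks the best child; White (the minimizer) the worst.
        return max(values) if maximizing else min(values)

    # Black to move. After Black plays in the center, White chooses between two
    # replies that the evaluation function scores +5 and +3 (cf. figure 16).
    tree = [[5, 3],   # center move: White will pick min(5, 3) = 3
            [2, 4],   # another Black move: White picks 2
            [1, 6]]   # another: White picks 1
    print(minimax(tree, maximizing=True))    # 3: Black's best guaranteed value

Samuel's program, Deep Blue, and AlphaGo each add many refinements (pruning, learned evaluation functions, sampling), but this backing up of leaf values is the shared core.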

第一个问题——程序应该探索树的哪一部分?——是人工智能中最重要的问题之一:代理应该做哪些计算?对于玩游戏的程序来说,这至关重要,因为它们只有很少的固定时间,将其用于无意义的计算肯定会失败。对于人类和其他在现实世界中运作的代理来说,这甚至更为重要,因为现实世界要复杂得多:除非选择得当,否则再多的计算也无法对决定做什么的问题产生丝毫的影响。如果你正在开车,一只驼鹿走到路中间,考虑是否要用欧元换英镑,或者黑棋是否应该在围棋棋盘的中心先走一步,都是毫无意义的。

The first question—which part of the tree should the program explore?—is an example of one of the most important questions in AI: What computations should an agent do? For game-playing programs, it is vitally important because they have only a small, fixed allocation of time, and using it on pointless computations is a sure way to lose. For humans and other agents operating in the real world, it is even more important because the real world is so much more complex: unless chosen well, no amount of computation is going to make the smallest dent in the problem of deciding what to do. If you are driving and a moose walks into the middle of the road, it’s no use thinking about whether to trade euros for pounds or whether Black should make its first move in the center of the Go board.

人类管理自己的计算活动、从而合理快速地做出合理决策的能力,至少与他们感知和正确推理的能力一样出色。而且这似乎是我们自然而然、毫不费力地获得的东西:当我的父亲教我下棋时,他教了我规则,但并没有教我用什么巧妙的算法来选择探索博弈树的哪些部分、忽略哪些部分。

The ability of humans to manage their computational activity so that reasonable decisions get made reasonably quickly is at least as remarkable as their ability to perceive and to reason correctly. And it seems to be something we acquire naturally and effortlessly: when my father taught me to play chess, he taught me the rules, but he did not also teach me such-and-such clever algorithm for choosing which parts of the game tree to explore and which parts to ignore.

这是怎么发生的?我们凭什么来引导我们的思想?答案是,计算的价值在于它能够提高你的决策质量。选择计算的过程称为元推理,即对推理进行推理。正如可以根据预期价值理性地选择行动一样,计算也可以这样选择。这称为理性元推理。3基本思想非常简单:

How does this happen? On what basis can we direct our thoughts? The answer is that a computation has value to the extent that it can improve your decision quality. The process of choosing computations is called metareasoning, which means reasoning about reasoning. Just as actions can be chosen rationally, on the basis of expected value, so can computations. This is called rational metareasoning.3 The basic idea is very simple:

进行能够最大程度提高决策质量的计算,并在成本(时间方面)超过预期提高时停止。

Do the computations that will give the highest expected improvement in decision quality, and stop when the cost (in terms of time) exceeds the expected improvement.

就是这样。不需要花哨的算法!这个简单的原则在包括国际象棋和围棋在内的各种问题中产生了有效的计算行为。我们的大脑似乎也实现了类似的东西,这解释了为什么我们不需要为学习玩的每款新游戏学习新的、特定于游戏的思考算法。

That’s it. No fancy algorithm needed! This simple principle generates effective computational behavior in a wide range of problems, including chess and Go. It seems likely that our brains implement something similar, which explains why we don’t need to learn new, game-specific algorithms for thinking with each new game we learn to play.
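
As a toy illustration, here is a sketch in which each computation is one more noisy simulation of a candidate move. Everything here is invented: the move values, the noise level, the time cost, and the crude uncertainty proxy standing in for the expected improvement in decision quality, which a real system would derive from a model of how further computation could change the decision.

    import random
    random.seed(0)

    # Toy setting: three candidate moves with unknown values (invented numbers).
    true_values = {"a": 0.55, "b": 0.50, "c": 0.20}
    samples = {m: [] for m in true_values}

    def simulate(move):                        # one "computation": a noisy evaluation
        samples[move].append(true_values[move] + random.gauss(0, 0.1))

    def estimate(move):
        return sum(samples[move]) / len(samples[move])

    def expected_improvement(move):
        # Crude proxy: the more often a move has been simulated, the less one
        # further simulation is expected to improve the decision.
        return 0.1 / (1 + len(samples[move]))

    TIME_COST = 0.004                          # cost of one simulation, same units
    for m in true_values:
        simulate(m)                            # start with one sample of each move
    while True:
        best = max(true_values, key=expected_improvement)
        if expected_improvement(best) <= TIME_COST:
            break                              # stop: more computation isn't worth its cost
        simulate(best)
    print("chosen move:", max(true_values, key=estimate))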

当然,探索从当前状态延伸到未来的可能性树并不是做出决策的唯一方法。通常,从目标开始倒推更有意义。例如,路上有驼鹿表明目标是避免撞到驼鹿,这反过来又表明了三种可能的动作:左转、右转或猛踩刹车。它并不表明用欧元换英镑或在中心放一块黑石头的行为。因此,目标对人的思维有很好的聚焦作用。目前没有任何游戏程序利用这个想法;事实上,它们通常会考虑所有可能的合法行动。这是我不担心 AlphaZero 统治世界的(众多)原因之一。

Exploring a tree of possibilities that stretches forward into the future from the current state is not the only way to reach decisions, of course. Often, it makes more sense to work backwards from the goal. For example, the presence of the moose in the road suggests the goal of avoid hitting the moose, which in turn suggests three possible actions: swerve left, swerve right, or slam on the brakes. It does not suggest the action of trading euros for pounds or putting a black stone in the center. Thus, goals have a wonderful focusing effect on one’s thinking. No current game-playing programs take advantage of this idea; in fact, they typically consider all possible legal actions. This is one of the (many) reasons why I am not worried about AlphaZero taking over the world.

展望未来

Looking further ahead

假设您已决定在围棋棋盘上走某一步。太好了!现在您必须真正地执行它。在现实世界中,这包括伸手到盛放未下棋子的棋盒中拿起一颗棋子,将手移到预定位置上方,然后将棋子整齐地放在该点上,并按照围棋礼仪,或轻轻落下,或掷地有声。

Let’s suppose you have decided to make a specific move on the Go board. Great! Now you have to actually do it. In the real world, this involves reaching into the bowl of unplayed stones to pick up a stone, moving your hand above the intended location, and placing the stone neatly on the spot, either quietly or emphatically according to Go etiquette.

反过来,每个阶段都包含复杂的感知和运动控制命令,涉及手、臂、肩和眼的肌肉和神经。当你伸手去拿棋子时,你还要确保身体的其他部分不会因为重心的转移而翻倒。你可能没有意识到自己在选择这些动作,但这并不意味着你的大脑没有选择它们。例如,棋盒里可能有很多棋子,但你的“手”——实际上是处理感官信息的大脑——仍然必须选择其中的一颗来捡起。

Each of these stages, in turn, consists of a complex dance of perception and motor control commands involving the muscles and nerves of the hand, arm, shoulder, and eyes. And while reaching for a stone, you’re making sure the rest of your body doesn’t topple over thanks to the shift in your center of gravity. The fact that you may not be consciously aware of selecting these actions does not mean that they aren’t being selected by your brain. For example, there may be many stones in the bowl, but your “hand”—really, your brain processing sensory information—still has to choose one of them to pick up.

我们所做的几乎每件事都是这样的。开车时,我们可能会选择向左变换车道;但这一动作需要看后视镜和肩膀,也许还要调整速度,并移动方向盘,同时监控进度,直到操作完成。在谈话中,一个常规的回应,例如“好的,让我看看我的日历,然后再回复你”,需要发出十四个音节,每个音节都需要数百个精确协调的运动控制命令,这些命令会发送到舌头、嘴唇、下巴、喉咙和呼吸器官的肌肉上。对于你的母语来说,这个过程是自动的;它与在计算机程序中运行子程序的想法非常相似(参见此页面)。复杂的动作序列可以变得常规和自动,从而在更复杂的过程中作为单个动作发挥作用,这一事实对人类认知来说是绝对基本的。用一种不太熟悉的语言说单词——比如问去波兰的 Szczebrzeszyn 的路——是一种有用的提醒你,在你的生活中曾有过这样的时刻,阅读和说话是一项困难的任务,需要付出脑力劳动和大量的练习。

Almost everything we do is like this. While driving, we might choose to change lanes to the left; but this action involves looking in the mirror and over your shoulder, perhaps adjusting speed, and moving the steering wheel while monitoring progress until the maneuver is complete. In conversation, a routine response such as “OK, let me check my calendar and get back to you” involves articulating fourteen syllables, each of which requires hundreds of precisely coordinated motor control commands to the muscles of the tongue, lips, jaw, throat, and breathing apparatus. For your native language, this process is automatic; it closely resembles the idea of running a subroutine in a computer program (see this page). The fact that complex action sequences can become routine and automatic, thereby functioning as single actions in still more complex processes, is absolutely fundamental to human cognition. Saying words in a less familiar language—perhaps asking directions to Szczebrzeszyn in Poland—is a useful reminder that there was a time in your life when reading and speaking words were difficult tasks requiring mental effort and lots of practice.

因此,大脑面临的真正问题不是在围棋棋盘上选择走子,而是向肌肉发送运动控制命令。如果我们将注意力从围棋走子转移到运动控制命令,问题看起来就大不相同了。粗略地说,大脑大约每 100 毫秒可以发出一次命令。我们大约有 600 块肌肉,所以理论上最多每秒约 6000 次动作,每小时 2000 万次,每年 2000 亿次,一生约 20 万亿次。明智地使用它们吧!

So, the real problem that your brain faces is not choosing a move on the Go board but sending motor control commands to your muscles. If we shift our attention from the level of Go moves to the level of motor control commands, the problem looks very different. Very roughly, your brain can send out commands about every one hundred milliseconds. We have about six hundred muscles, so that’s a theoretical maximum of about six thousand actuations per second, twenty million per hour, two hundred billion per year, twenty trillion per lifetime. Use them wisely!

现在,假设我们尝试应用类似 AlphaZero 的算法来解决这一级别的决策问题。在围棋中,AlphaZero 可以向前看大约 50 步。但 50 步运动控制命令只能让你提前几秒钟!对于一小时围棋游戏中的 2000 万个运动控制命令来说,这还不够,对于攻读博士学位所涉及的万亿(1,000,000,000,000)步来说,当然也不够。因此,尽管 AlphaZero 在围棋中比任何人都能看得更远,但这种能力似乎在现实世界中没有帮助。这是错误的向前看。

Now, suppose we tried to apply an AlphaZero-like algorithm to solve the decision problem at this level. In Go, AlphaZero looks ahead perhaps fifty steps. But fifty steps of motor control commands get you only a few seconds into the future! Not enough for the twenty million motor control commands in an hour-long game of Go, and certainly not enough for the trillion (1,000,000,000,000) steps involved in doing a PhD. So, even though AlphaZero looks further ahead in Go than any human can, that ability doesn’t seem to help in the real world. It’s the wrong kind of lookahead.

当然,我并不是说攻读博士学位实际上需要提前规划出一万亿个肌肉动作。最初只制定相当抽象的计划——可能是选择伯克利或其他地方,选择博士导师或研究课题,申请资金,获得学生签证,前往所选城市,做一些研究,等等。为了做出选择,你只需要对正确的事情进行足够的思考,这样决定就会变得清晰。如果某些抽象步骤(例如获得签证)的可行性不清楚,你会做更多的思考,也许还要收集信息,这意味着在某些方面使计划更加具体:也许选择你有资格获得的签证类型,收集必要的文件,并提交申请。图 17显示了抽象计划和将 GetVisa 步骤细化为三步子计划。当开始执行计划时,必须将其初始步骤细化到原始水平,以便您的身体能够执行它们。

I’m not saying, of course, that doing a PhD actually requires planning out a trillion muscle actuations in advance. Only quite abstract plans are made initially—perhaps choosing Berkeley or some other place, choosing a PhD supervisor or research topic, applying for funding, getting a student visa, traveling to the chosen city, doing some research, and so on. To make your choices, you do just enough thinking, about just the right things, so that the decision becomes clear. If the feasibility of some abstract step such as getting the visa is unclear, you do some more thinking and perhaps information gathering, which means making the plan more concrete in certain aspects: maybe choosing a visa type for which you are eligible, collecting the necessary documents, and submitting the application. Figure 17 shows the abstract plan and the refinement of the GetVisa step into a three-step subplan. When the time comes to begin carrying out the plan, its initial steps have to be refined all the way down to the primitive level so that your body can execute them.

图 17:为选择在伯克利攻读博士学位的留学生制定的抽象计划。GetVisa 步骤的可行性尚不确定,但已扩展为一项抽象计划。

FIGURE 17: An abstract plan for an overseas student who has chosen to get a PhD at Berkeley. The GetVisa step, whose feasibility is uncertain, has been expanded out into an abstract plan of its own.

AlphaGo 根本无法进行这种思考:它唯一考虑的动作是从初始状态开始按顺序发生的原始动作。它没有抽象计划的概念。试图将 AlphaGo 应用到现实世界中就像试图通过思考第一个字母应该是 A、B、C 还是其他来写小说一样。

AlphaGo simply cannot do this kind of thinking: the only actions it ever considers are primitive actions occurring in a sequence from the initial state. It has no notion of abstract plan. Trying to apply AlphaGo in the real world is like trying to write a novel by wondering whether the first letter should be A, B, C, and so on.

1962 年,赫伯特·西蒙在著名论文《复杂性的架构》中强调了层次化组织的重要性。4自 20 世纪 70 年代初以来,人工智能研究人员已经开发出各种方法来构建和完善层次化组织的计划。5由此产生的一些系统能够构建包含数千万个步骤的计划,例如,组织大型工厂中的制造活动。

In 1962, Herbert Simon emphasized the importance of hierarchical organization in a famous paper, “The Architecture of Complexity.”4 AI researchers since the early 1970s have developed a variety of methods that construct and refine hierarchically organized plans.5 Some of the resulting systems are able to construct plans with tens of millions of steps—for example, to organize manufacturing activities in a large factory.

我们现在对抽象动作的含义有了很好的理论理解,即如何定义它们对世界的影响。6例如,考虑图 17中的抽象动作 GoToBerkeley。它可以以多种不同的方式实现,每种方式都会对世界产生不同的影响:你可以乘船去那里,偷渡到船上,飞到加拿大然后步行过境,租一架私人飞机等等。但你现在不需要考虑这些选择。只要你确定有一种方法可以做到这一点,既不会耗费太多时间和金钱,也不会带来太多风险,从而危及计划的其余部分,你就可以将抽象步骤 GoToBerkeley 放入计划中,并确保该计划能够奏效。通过这种方式,我们可以构建高级计划,最终将其变成数十亿或数万亿个原始步骤,而无需担心这些步骤是什么,直到真正执行它们的时候。

We now have a pretty good theoretical understanding of the meaning of abstract actions—that is, of how to define the effects they have on the world.6 Consider, for example, the abstract action GoToBerkeley in figure 17. It can be implemented in many different ways, each of which produces different effects on the world: you could sail there, stow away on a ship, fly to Canada and walk across the border, hire a private jet, and so on. But you need not consider any of these choices for now. As long as you are sure there is a way to do it that doesn’t consume so much time and money or incur so much risk as to imperil the rest of the plan, you can just put the abstract step GoToBerkeley into the plan and rest assured that the plan will work. In this way, we can build high-level plans that will eventually turn into billions or trillions of primitive steps without ever worrying about what those steps are until it’s time to actually do them.
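
A minimal sketch of this refinement process appears below. The refinement library and action names are invented; a real hierarchical planner would also verify, at each level, that every abstract step has some refinement that fits the plan's budget of time, money, and risk.

    # Invented library mapping each abstract action to one possible subplan.
    refinements = {
        "GetPhD":       ["ApplyToBerkeley", "GetVisa", "GoToBerkeley", "DoResearch"],
        "GetVisa":      ["ChooseVisaType", "CollectDocuments", "SubmitApplication"],
        "GoToBerkeley": ["FlyToSFO", "TakeTrainToBerkeley"],
    }

    def refine(step, depth=0):
        """Expand abstract steps into subplans; steps with no entry are primitive."""
        print("  " * depth + step)
        for substep in refinements.get(step, []):
            refine(substep, depth + 1)

    refine("GetPhD")   # prints the plan hierarchy down to primitive steps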

当然,如果没有层次结构,这一切都不可能实现。如果没有获得签证和撰写论文等高级行动,我们就无法制定获得博士学位的抽象计划;如果没有获得博士学位和创办公司等更高级别的行动,我们就无法计划获得博士学位然后创办公司。在现实世界中,如果没有数十个抽象层次的庞大行动库,我们就会迷失方向。(在围棋游戏中,没有明显的行动层次结构,所以我们大多数人都会迷失方向。)然而,目前所有现有的分层规划方法都依赖于人类生成的抽象和具体行动层次结构;我们还不了解如何从经验中学习这种层次结构。

Of course, none of this is possible without the hierarchy. Without high-level actions such as getting a visa and writing a thesis, we cannot make an abstract plan to get a PhD; without still-higher-level actions such as getting a PhD and starting a company, we cannot plan to get a PhD and then start a company. In the real world, we would be lost without a vast library of actions at dozens of levels of abstraction. (In the game of Go, there is no obvious hierarchy of actions, so most of us are lost.) At present, however, all existing methods for hierarchical planning rely on a human-generated hierarchy of abstract and concrete actions; we do not yet understand how such hierarchies can be learned from experience.

附录 B

Appendix B

知识与逻辑

KNOWLEDGE AND LOGIC

逻辑是研究基于确定知识的推理。它对于主题而言是完全通用的 — 也就是说,知识可以是关于任何事物的。因此,逻辑是我们理解通用智能不可或缺的一部分。

Logic is the study of reasoning with definite knowledge. It is fully general with regard to subject matter—that is, the knowledge can be about anything at all. Logic is therefore an indispensable part of our understanding of general purpose intelligence.

逻辑的主要要求是一种形式语言,语言中的句子具有精确的含义,这样就有一个明确的过程来确定一个句子在给定情况下是真还是假。仅此而已。一旦我们有了它,我们就可以编写可靠的推理算法,从已知的句子中生成新的句子。这些新句子保证可以从系统已知的句子中推出,也就是说,在原始句子为真的任何情况下,新句子必然为真。这使得机器能够回答问题、证明数学定理或构建保证能成功的计划。

Logic’s main requirement is a formal language with precise meanings for the sentences in the language, so that there is an unambiguous process for determining whether a sentence is true or false in a given situation. That’s it. Once we have that, we can write sound reasoning algorithms that produce new sentences from sentences that are already known. Those new sentences are guaranteed to follow from the sentences that the system already knows, meaning that the new sentences are necessarily true in any situation where the original sentences are true. This allows a machine to answer questions, prove mathematical theorems, or construct plans that are guaranteed to succeed.

高中代数提供了一个很好的例子(尽管它可能勾起痛苦的回忆)。形式语言包括这样的句子:4x + 1 = 2y − 5。这个句子在 x = 5 且 y = 13 的情况下为真,在 x = 5 且 y = 6 的情况下为假。从这个句子可以推导出另一个句子,例如 y = 2x + 3,并且只要第一个句子为真,第二个句子也一定为真。

High-school algebra provides a good example (albeit one that may evoke painful memories). The formal language includes sentences such as 4x + 1 = 2y − 5. This sentence is true in the situation where x = 5 and y = 13, and false when x = 5 and y = 6. From this sentence one can derive another sentence such as y = 2x + 3, and whenever the first sentence is true, the second is guaranteed to be true too.

逻辑的核心思想是在古印度、中国和希腊独立发展起来的,即精确含义和合理推理的概念可以应用于关于任何事物的句子,而不仅仅是数字。典型的例子是从“苏格拉底是人”和“所有人都会死”开始,然后得出“苏格拉底会死”。1这种推导是严格形式化的,因为它不依赖于任何关于苏格拉底是谁,或者人和会死意味着什么的进一步信息。逻辑推理是严格形式化的,这一事实意味着可以编写算法来实现它。

The core idea of logic, developed independently in ancient India, China, and Greece, is that the same notions of precise meaning and sound reasoning can be applied to sentences about anything at all, not just numbers. The canonical example starts with “Socrates is a man” and “All men are mortal” and derives “Socrates is mortal.”1 This derivation is strictly formal in the sense that it does not rely on any further information about who Socrates is or what man and mortal mean. The fact that logical reasoning is strictly formal means that it is possible to write algorithms that do it.

命题逻辑

Propositional logic

就我们理解人工智能的能力和前景而言,有两种真正重要的逻辑:命题逻辑和一阶逻辑。两者之间的区别对于理解人工智能的现状及其可能的发展至关重要。

For our purposes in understanding the capabilities and prospects for AI, there are two important kinds of logic that really matter: propositional logic and first-order logic. The difference between the two is fundamental to understanding the current situation in AI and how it is likely to evolve.

让我们先从命题逻辑开始,它是两者中比较简单的一种。句子仅由两种东西组成:表示命题的符号,这些命题可以是真或假,以及逻辑连接词,例如“与”、“或”、“非”和“如果……那么”。(我们很快就会看到一个例子。)这些逻辑连接词有时被称为“布尔”连接词,以乔治·布尔 (George Boole) 的名字命名,他是 19 世纪的逻辑学家,用新的数学思想重新振兴了他的领域。它们与计算机芯片中使用的逻辑门完全相同。

Let’s start with propositional logic, which is the simpler of the two. Sentences are made of just two kinds of things: symbols that stand for propositions that can be true or false, and logical connectives such as and, or, not, and if . . . then. (We’ll see an example shortly.) These logical connectives are sometimes called Boolean, after George Boole, a nineteenth-century logician who reinvigorated his field with new mathematical ideas. They are just the same as the logic gates used in computer chips.

自 20 世纪 60 年代初以来,命题逻辑中的实用推理算法就已为人所知。2、3尽管一般的推理任务在最坏的情况下可能需要指数级的时间,4但现代命题推理算法可以处理包含数百万命题符号和数千万个句子的问题。它们是构建有保障的物流计划、在制造芯片设计之前对其进行验证以及在部署软件应用程序和安全协议之前检查其正确性的核心工具。令人惊奇的是,只要将这些任务表述为推理任务,单一算法(命题逻辑的推理算法)就能解决所有这些任务。显然,这是朝着智能系统通用性目标迈出的一步。

Practical algorithms for reasoning in propositional logic have been known since the early 1960s.2,3 Although the general reasoning task may require exponential time in the worst case,4 modern propositional reasoning algorithms handle problems with millions of proposition symbols and tens of millions of sentences. They are a core tool for constructing guaranteed logistical plans, verifying chip designs before they are manufactured, and checking the correctness of software applications and security protocols before they are deployed. The amazing thing is that a single algorithm—a reasoning algorithm for propositional logic—solves all these tasks once they have been formulated as reasoning tasks. Clearly, this is a step towards the goal of generality in intelligent systems.
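
To see in code what "guaranteed to follow" means, here is a brute-force sketch that tests entailment by enumerating every assignment of true and false to the symbols (the Socrates example, propositionalized with an implication). Modern solvers are enormously cleverer than this exponential enumeration, but they are checking the same definition.

    from itertools import product

    # Sentences are functions from a world (symbol -> True/False) to True/False.
    symbols = ["socrates_is_man", "socrates_is_mortal"]
    kb = [
        lambda w: w["socrates_is_man"],                                   # Socrates is a man
        lambda w: (not w["socrates_is_man"]) or w["socrates_is_mortal"],  # if man, then mortal
    ]
    query = lambda w: w["socrates_is_mortal"]

    def entails(kb, query, symbols):
        """True if the query holds in every world where all KB sentences hold."""
        for values in product([True, False], repeat=len(symbols)):
            world = dict(zip(symbols, values))
            if all(s(world) for s in kb) and not query(world):
                return False
        return True

    print(entails(kb, query, symbols))   # True: "Socrates is mortal" is guaranteed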

不幸的是,这并不是一个很大的进步,因为命题逻辑的语言不太具有表达力。让我们试着表达围棋中合法走子的基本规则,看看这在实践中意味着什么:“轮到走子的人可以在任何未占据的交叉点上着子。”5第一步是决定在谈论围棋走子和围棋棋盘局面时命题符号是什么。重要的基本命题是特定颜色的棋子是否在特定时间位于特定位置。因此,我们需要诸如 White_Stone_On_5_5_At_Move_38 和 Black_Stone_On_5_5_At_Move_38 之类的符号。(请记住,与 man、mortal 和 Socrates 一样,推理算法不需要知道符号的含义。)那么,白棋能够在第 38 步在 5,5 交叉点着子的逻辑条件将是

Unfortunately, it’s not a very big step because the language of propositional logic is not very expressive. Let’s see what this means in practice when we try to express the basic rule for legal moves in Go: “The player whose turn it is to move can play a stone on any unoccupied intersection.”5 The first step is to decide what the proposition symbols are going to be for talking about Go moves and Go board positions. The fundamental proposition that matters is whether a stone of a particular color is on a particular location at a particular time. So, we’ll need symbols such as White_Stone_On_5_5_At_Move_38 and Black_Stone_On_5_5_At_Move_38. (Remember that, as with man, mortal, and Socrates, the reasoning algorithm doesn’t need to know what the symbols mean.) Then the logical condition for White to be able to play at the 5,5 intersection at move 38 would be

(非 White_Stone_On_5_5_At_Move_38)并且

(not White_Stone_On_5_5_At_Move_38) and

(非 Black_Stone_On_5_5_At_Move_38)

(not Black_Stone_On_5_5_At_Move_38)

换句话说:没有白子,也没有黑子。这似乎很简单。不幸的是,在命题逻辑中,必须为游戏中的每个位置和每一步单独写出这条规则。因为每场游戏有 361 个位置和大约 300 步,这意味着有超过 100,000 份规则副本!对于涉及多个棋子和位置的吃子和重复规则,情况更糟,我们很快就会写满数百万页。

In other words: there’s no white stone and there’s no black stone. That seems simple enough. Unfortunately, in propositional logic it would have to be written out separately for each location and for each move in the game. Because there are 361 locations and around 300 moves per game, this means over 100,000 copies of the rule! For the rules concerning captures and repetitions, which involve multiple stones and locations, the situation is even worse, and we quickly fill up millions of pages.

显然,现实世界比围棋棋盘大得多:远不止 361 个位置和 300 个时间步骤,而且除了棋子之外还有很多种东西;因此,使用命题语言来获取现实世界知识的前景是完全没有希望的。

The real world is, obviously, much bigger than the Go board: there are far more than 361 locations and 300 time steps, and there are many kinds of things besides stones; so, the prospect of using a propositional language for knowledge of the real world is utterly hopeless.

问题不仅仅在于规则手册的长度荒谬,还在于学习系统需要多得荒谬的经验才能从例子中学到规则。人类只需要一两个例子就能掌握落子、吃子等基本概念,而基于命题逻辑的智能系统则必须针对每个位置和每个时间步分别看到落子和吃子的例子。系统无法像人类一样从几个例子中进行概括,因为它无法表达一般规则。这种限制不仅适用于基于命题逻辑的系统,也适用于任何具有同等表达能力的系统,其中包括贝叶斯网络(命题逻辑的概率表亲)和神经网络(人工智能“深度学习”方法的基础)。

It’s not just the ridiculous size of the rulebook that’s a problem: it’s also the ridiculous amount of experience a learning system would need to acquire the rules from examples. While a human needs just one or two examples to get the basic ideas of placing a stone, capturing stones, and so on, an intelligent system based on propositional logic has to be shown examples of moving and capturing separately for each location and time step. The system cannot generalize from a few examples, as a human does, because it has no way to express the general rule. This limitation applies not just to systems based on propositional logic but also to any system with comparable expressive power. That includes Bayesian networks, which are probabilistic cousins of propositional logic, and neural networks, which are the basis for the “deep learning” approach to AI.

一阶逻辑

First-order logic

那么,下一个问题是,我们能否设计出一种更具表现力的逻辑语言?我们希望能够通过以下方式将围棋规则告知基于知识的系统:

So, the next question is, can we devise a more expressive logical language? We’d like one in which it is possible to tell the rules of Go to the knowledge-based system in the following way:

对于棋盘上的所有位置、所有时间步骤,均遵守以下规则……

for all locations on the board, and for all time steps, here are the rules . . .

一阶逻辑由德国数学家戈特洛布·弗雷格于 1879 年提出,它允许人们以这种方式编写规则。6命题逻辑和一阶逻辑之间的关键区别在于:命题逻辑假设世界是由真或假的命题组成的,而一阶逻辑假设世界是由可以以各种方式相互关联的对象组成的。例如,可能存在彼此相邻的位置、先后相继的时间、在特定时间位于某个位置上的棋子,以及在特定时间合法的走法。一阶逻辑允许人们断言某个属性对世界上的所有对象都成立;因此,人们可以写

First-order logic, introduced by the German mathematician Gottlob Frege in 1879, allows one to write the rules this way.6 The key difference between propositional and first-order logic is this: whereas propositional logic assumes the world is made of propositions that are true or false, first-order logic assumes the world is made of objects that can be related to each other in various ways. For example, there could be locations that are adjacent to each other, times that follow each other consecutively, stones that are on locations at particular times, and moves that are legal at particular times. First-order logic allows one to assert that some property is true for all objects in the world; so, one can write

对于所有时间步 t、所有位置 l 以及所有颜色 c,

如果在时间 t 轮到 c 落子,并且位置 l 在时间 t 未被占用,

那么 c 在时间 t 于位置 l 落子就是合法的。

for all time steps t, and for all locations l, and for all colors c,

if it is c’s turn to move at time t and l is unoccupied at time t,

then it is legal for c to play a stone at location l at time t.

加上一些额外的注意事项和一些定义棋盘位置、两种颜色以及“未被占用”的含义的附加语句,我们就有了完整围棋规则的雏形。这些规则在一阶逻辑中占据的篇幅与在英语中大致相同。

With some extra caveats and some additional sentences that define the board locations, the two colors, and what unoccupied means, we have the beginnings of the complete rules of Go. The rules take up about as much space in first-order logic as they do in English.
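
The economy of the first-order formulation is easy to see in code: one statement quantified over all colors and locations replaces the 100,000 propositional copies. The sketch below uses an invented board representation and, like the rule above, ignores the extra caveats about captures and repetition.

    SIZE = 19
    locations = [(x, y) for x in range(SIZE) for y in range(SIZE)]

    def legal_moves(board, color, to_move):
        # One rule quantified over all locations l (and implicitly all times):
        # if it is color's turn and l is unoccupied, playing at l is legal.
        if color != to_move:
            return []
        return [l for l in locations if board.get(l) is None]

    empty_board = {}   # dict mapping location -> color; empty at the start
    print(len(legal_moves(empty_board, "black", to_move="black")))   # 361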

20 世纪 70 年代末逻辑编程的发展为逻辑推理提供了优雅而高效的技术,这种技术体现在一种名为 Prolog 的编程语言中。计算机科学家研究出了如何使 Prolog 中的逻辑推理以每秒数百万个推理步骤运行,从而使许多逻辑应用变得实用。1982 年,日本政府宣布对基于 Prolog 的人工智能进行巨额投资,称为第五代项目。7美国和英国也做出了类似的努力。8、9

The development of logic programming in the late 1970s provided elegant and efficient technology for logical reasoning embodied in a programming language called Prolog. Computer scientists worked out how to make logical reasoning in Prolog run at millions of reasoning steps per second, making many applications of logic practical. In 1982, the Japanese government announced a huge investment in Prolog-based AI called the Fifth Generation project,7 and the United States and UK responded with similar efforts.8,9

不幸的是,第五代项目和其他类似项目在 20 世纪 80 年代末和 90 年代初失去了动力,部分原因是逻辑无法处理不确定的信息。它们成了一个很快变成贬义词的术语的代表:“好的老式人工智能”(Good Old-Fashioned AI),简称 GOFAI。10人们认为逻辑与人工智能无关,这成为一种时尚;事实上,现在在深度学习领域工作的很多人工智能研究人员对逻辑一无所知。这种时尚似乎很可能会消退:如果你接受世界上存在以各种方式相互关联的对象,那么一阶逻辑就会是相关的,因为它提供了关于对象和关系的基本数学。谷歌 DeepMind 首席执行官 Demis Hassabis 也持这种观点:11

Unfortunately, the Fifth Generation project and others like it ran out of steam in the late 1980s and early 1990s, partly because of the inability of logic to handle uncertain information. They epitomized what soon became a pejorative term: Good Old-Fashioned AI, or GOFAI.10 It became fashionable to dismiss logic as irrelevant to AI; indeed, many AI researchers working now in the area of deep learning don’t know anything about logic. This fashion seems likely to fade: if you accept that the world has objects in it that are related to each other in various ways, then first-order logic is going to be relevant, because it provides the basic mathematics of objects and relations. This view is shared by Demis Hassabis, CEO of Google DeepMind:11

你可以把深度学习想象成大脑中的感觉皮层:视觉皮层或听觉皮层。但真正的智能远不止这些,你必须将其重新组合成更高层次的思维和符号推理,很多都是经典人工智能在 80 年代尝试解决的问题。

……我们希望[这些系统]能够达到这种符号层次的推理——数学、语言和逻辑。所以这是我们工作的很大一部分。

You can think about deep learning as it currently is today as the equivalent in the brain to our sensory cortices: our visual cortex or auditory cortex. But, of course, true intelligence is a lot more than just that, you have to recombine it into higher-level thinking and symbolic reasoning, a lot of the things classical AI tried to deal with in the 80s.

. . . We would like [these systems] to build up to this symbolic level of reasoning—maths, language, and logic. So that’s a big part of our work.

因此,人工智能研究前三十年最重要的经验之一是:一个程序要想在任何有用的意义上“知道”事物,就需要至少与一阶逻辑相当的表示和推理能力。到目前为止,我们还不知道这种能力将采取的确切形式:它可能被纳入概率推理系统、深度学习系统,或某种尚待发明的混合设计中。

Thus, one of the most important lessons from the first thirty years of AI research is that a program that knows things, in any useful sense, will need a capacity for representation and reasoning that is at least comparable to that offered by first-order logic. As yet, we do not know the exact form this will take: it may be incorporated into probabilistic reasoning systems, into deep learning systems, or into some still-to-be-invented hybrid design.

附录 C

Appendix C

不确定性和概率

UNCERTAINTY AND PROBABILITY

逻辑为使用确定性知识进行推理提供了一般基础,而概率论则涵盖使用不确定信息进行推理(确定性知识是其中的特例)。不确定性是现实世界中主体的正常认知情况。尽管概率的基本思想是在 17 世纪发展起来的,但直到最近才有可能以形式化的方式表示和推理大型概率模型。

Whereas logic provides a general basis for reasoning with definite knowledge, probability theory encompasses reasoning with uncertain information (of which definite knowledge is a special case). Uncertainty is the normal epistemic situation of an agent in the real world. Although the basic ideas of probability were developed in the seventeenth century, only recently has it become possible to represent and reason with large probability models in a formal way.

概率基础

The basics of probability

概率论与逻辑学一样,都认为存在可能世界。人们通常首先定义它们是什么——例如,如果我掷一个普通的六面骰子,就会有六个世界(有时称为结果):1、2、3、4、5、6。其中恰好有一个会是实际情况,但事先我不知道是哪一个。概率论假设可以为每个世界附加一个概率;对于我的掷骰子,我将为每个世界附加 1/6。(这些概率恰好相等,但不一定非得如此;唯一的要求是概率之和必须等于 1。)现在我可以问这样的问题:“我掷出偶数的概率是多少?”为了得到这个概率,我只需将三个数字为偶数的世界的概率相加:1/6 + 1/6 + 1/6 = ½。

Probability theory shares with logic the idea that there are possible worlds. One usually starts out by defining what they are—for example, if I am rolling one ordinary six-sided die, there are six worlds (sometimes called outcomes): 1, 2, 3, 4, 5, 6. Exactly one of them will be the case, but a priori I don’t know which. Probability theory assumes that it is possible to attach a probability to each world; for my die roll, I’ll attach 1/6 to each world. (These probabilities happen to be equal, but it need not be that way; the only requirement is that the probabilities have to add up to 1.) Now I can ask a question such as “What’s the probability I’ll roll an even number?” To find this, I simply add up the probabilities for the three worlds where the number is even: 1/6 + 1/6 + 1/6 = ½.

考虑新证据也很简单。假设一位先知告诉我掷出的结果是质数(即 2、3 或 5)。这排除了世界 1、4 和 6。我只需取与剩余可能世界相关的概率并将它们按比例放大,使总和保持为 1。现在 2、3 和 5 的概率各为 ⅓,而我掷出偶数的概率现在只有 ⅓,因为 2 是唯一剩下的偶数。随着新证据的出现而更新概率的过程是贝叶斯更新的一个例子。

It’s also straightforward to take new evidence into account. Suppose an oracle tells me that the roll is a prime number (that is, 2, 3, or 5). This rules out the worlds 1, 4, and 6. I simply take the probabilities associated with the remaining possible worlds and scale them up so the total remains 1. Now the probabilities of 2, 3, and 5 are each ⅓, and the probability that my roll is an even number is now just ⅓, since 2 is the only remaining even roll. This process of updating probabilities as new evidence arrives is an example of Bayesian updating.
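
Both calculations can be written directly as operations on a table of world probabilities. A minimal sketch, using exact fractions:

    from fractions import Fraction

    prior = {w: Fraction(1, 6) for w in [1, 2, 3, 4, 5, 6]}   # one entry per world

    def probability(event, dist):
        return sum(p for w, p in dist.items() if event(w))

    print(probability(lambda w: w % 2 == 0, prior))            # P(even) = 1/2

    # Bayesian updating on the evidence "the roll is prime": rule out the
    # excluded worlds, then rescale so the probabilities again sum to 1.
    is_prime = lambda w: w in (2, 3, 5)
    z = probability(is_prime, prior)
    posterior = {w: (p / z if is_prime(w) else Fraction(0)) for w, p in prior.items()}
    print(probability(lambda w: w % 2 == 0, posterior))        # P(even | prime) = 1/3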

所以,概率这个东西看起来相当简单!连计算机都会把数字相加,那么问题出在哪里?当世界的数量不止几个时,问题就来了。例如,如果我掷骰子一百次,就会有 6^100 种结果。通过为每个结果单独附加一个数字来开始概率推理的过程是不可行的。处理这种复杂性的一个线索来自这样一个事实:如果已知骰子是公平的,则各次掷骰是独立的——也就是说,任何一次掷骰的结果都不会影响任何其他掷骰结果的概率。因此,独立性有助于构建复杂事件集的概率。

So, this probability stuff seems quite simple! Even a computer can add up numbers, so what’s the problem? The problem comes when there are more than a few worlds. For example, if I roll the die one hundred times, there are 6^100 outcomes. It’s infeasible to begin the process of probabilistic reasoning by attaching a number to each of these outcomes individually. A clue for dealing with this complexity comes from the fact that the die rolls are independent if the die is known to be fair—that is, the outcome of any single roll does not affect the probabilities for the outcomes of any other roll. Thus, independence is helpful in structuring the probabilities for complex sets of events.

假设我和儿子乔治正在玩大富翁。我的棋子在 Just Visiting 上,乔治拥有黄色地产组,其三处地产距离 Just Visiting 分别是十六、十七和十九格。他现在就应该为黄色地产组购买房屋,这样如果我落在这些方格上,我就得付给他高昂的租金,还是应该等到下一轮?这取决于我本回合落在黄色地产组上的概率。

Suppose I am playing Monopoly with my son George. My piece is on Just Visiting, and George owns the yellow set whose properties are sixteen, seventeen, and nineteen squares away from Just Visiting. Should he buy houses for the yellow set now, so that I have to pay him some exorbitant rent if I land on those squares, or should he wait until the next turn? That depends on the probability of landing on the yellow set in my current turn.

大富翁游戏掷骰子的规则如下:掷两个骰子,根据显示的总点数移动棋子;如果掷出对子,玩家再掷一次并再次移动;如果第二次也掷出对子,玩家第三次掷骰并再次移动(但如果第三次仍是对子,玩家就会入狱)。因此,例如,我可能会掷出 4-4,然后是 5-4,总计 17;或者掷出 2-2,然后是 2-2,然后是 6-2,总计 16。和以前一样,我只需把所有使我落在黄色地产组上的世界的概率加起来。不幸的是,这样的世界有很多。一共最多可能掷出六个骰子,因此世界的数量达到数千个。此外,各次掷骰不再相互独立,因为除非第一次掷出对子,否则第二次掷骰根本不会发生。另一方面,如果我们固定第一对骰子的值,那么第二对骰子的值就是独立的。有没有办法捕捉这种依赖关系?

Here are the rules for rolling the dice in Monopoly: two dice are rolled and the piece is moved according to the total shown; if doubles are rolled, the player rolls again and moves again; if the second roll is doubles, the player rolls a third time and moves again (but if the third roll is doubles, the player goes to jail instead). So, for example, I might roll 4-4 followed by 5-4, totaling 17; or 2-2, then 2-2, then 6-2, totaling 16. As before, I simply add up the probabilities of all worlds where I land on the yellow set. Unfortunately, there are a lot of worlds. As many as six dice could be rolled altogether, so the number of worlds runs into the thousands. Furthermore, the rolls are no longer independent, because the second roll won’t exist unless the first roll is doubles. On the other hand, if we fix the values of the first pair of dice, then the values of the second pair of dice are independent. Is there a way to capture this kind of dependency?
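
At this scale brute force still works, and it reproduces the numbers quoted in the Bayesian-network discussion below. The sketch enumerates all 6^6 equally likely worlds (six dice values, whether or not each roll ends up happening) and applies the rules above; conditioning on the evidence that the second roll is a double-3 is just filtering and renormalizing.

    from itertools import product

    YELLOW = {16, 17, 19}   # distances from Just Visiting to the yellow squares

    def lands_on_yellow(d1, d2, d3, d4, d5, d6):
        total = d1 + d2
        if total in YELLOW:
            return True          # impossible on the first roll (max is 12); kept for clarity
        if d1 != d2:
            return False         # no doubles: the turn ends here
        total += d3 + d4
        if total in YELLOW:
            return True
        if d3 != d4:
            return False         # turn ends after the second roll
        if d5 == d6:
            return False         # a third double: go to jail instead of moving
        return total + d5 + d6 in YELLOW

    worlds = list(product(range(1, 7), repeat=6))    # 6^6 = 46,656 equally likely worlds
    p = sum(lands_on_yellow(*w) for w in worlds) / len(worlds)
    print(f"P(lands on yellow) = {p:.4f}")           # ~0.0388

    # Condition on "the second roll is a double-3" (which implies the first
    # roll was a double): keep only the consistent worlds and renormalize.
    evidence = [w for w in worlds if w[0] == w[1] and w[2] == w[3] == 3]
    p2 = sum(lands_on_yellow(*w) for w in evidence) / len(evidence)
    print(f"P(lands on yellow | second roll 3-3) = {p2:.4f}")   # ~0.3611

The Bayesian network of figure 18 computes the same answers from just thirty-six input probabilities, without ever building the table of 46,656 worlds.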

贝叶斯网络

Bayesian networks

20 世纪 80 年代初,Judea Pearl 提出了一种称为贝叶斯网络(通常简称贝叶斯网)的形式化语言,它使得在许多现实情境中能以非常简洁的形式表示大量结果的概率。1

In the early 1980s, Judea Pearl proposed a formal language called Bayesian networks (often abbreviated to Bayes nets) that makes it possible, in many real-world situations, to represent the probabilities of a very large number of outcomes in a very concise form.1

图 18 显示了描述大富翁游戏中掷骰子的贝叶斯网络。必须提供的概率只是每次掷骰(D1、D2 等)取值 1、2、3、4、5、6 的 1/6 概率——也就是说,是三十六个数字,而不是数千个。解释该网络的确切含义需要一点数学知识,2但基本思想是箭头表示依赖关系——例如,Doubles12 的值取决于 D1 和 D2 的值。类似地,D3 和 D4(两个骰子的下一次掷出)的值取决于 Doubles12,因为如果 Doubles12 的值为假,那么 D3 和 D4 的值为 0(也就是说,不存在下一次掷骰)。

Figure 18 shows a Bayesian network that describes the rolling of dice in Monopoly. The only probabilities that have to be supplied are the 1/6 probabilities of the values 1, 2, 3, 4, 5, 6 for the individual die rolls (D1, D2, etc.)—that is, thirty-six numbers instead of thousands. Explaining the exact meaning of the network requires a little bit of mathematics,2 but the basic idea is that the arrows denote dependency relationships—for example, the value of Doubles12 depends on the values of D1 and D2. Similarly, the values of D3 and D4 (the next roll of the two dice) depend on Doubles12 because if Doubles12 has value false, then D3 and D4 have value 0 (that is, there is no next roll).

与命题逻辑一样,存在能够在给定任何证据时回答关于任何贝叶斯网络的任何问题的算法。例如,我们可以询问 LandsOnYellowSet 的概率,结果约为 3.88%。(这意味着乔治可以先等一等,暂不为黄色地产组买房子。)更进一步,我们可以求出在已知第二次掷骰是双 3 的条件下 LandsOnYellowSet 的概率。算法会自行推断出,在这种情况下第一次掷骰必定是对子,并得出答案约为 36.1%。这是贝叶斯更新的一个例子:加入新证据(第二次掷骰是双 3)后,LandsOnYellowSet 的概率从 3.88% 变为 36.1%。类似地,我掷三次骰子(Doubles34 为真)的概率是 2.78%,而在已知我落在黄色地产组上的条件下,我掷三次骰子的概率是 20.44%。

Just as with propositional logic, there are algorithms that can answer any question for any Bayesian network with any evidence. For example, we can ask for the probability of LandsOnYellowSet, which turns out to be about 3.88 percent. (This means that George can wait before buying houses for the yellow set.) Slightly more ambitiously, we can ask for the probability of LandsOnYellowSet given that the second roll is a double-3. The algorithm works out for itself that, in that case, the first roll must have been a double and concludes that the answer is about 36.1 percent. This is an example of Bayesian updating: when the new evidence (that the second roll is a double-3) is added, the probability of LandsOnYellowSet changes from 3.88 percent to 36.1 percent. Similarly, the probability that I roll three times (Doubles34 is true) is 2.78 percent, while the probability that I roll three times given that I land on the yellow set is 20.44 percent.
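
The same numbers fall out of the brute-force enumeration sketched earlier, because Bayesian updating amounts to restricting attention to the worlds consistent with the evidence (again an illustrative sketch, reusing the worlds list and the lands_on_yellow function from above):

    # Evidence: the second roll exists (first roll was a double) and is a double-3.
    evidence = [w for w in worlds if w[0] == w[1] and w[2] == w[3] == 3]
    p_updated = sum(lands_on_yellow(*w) for w in evidence) / len(evidence)
    print(round(p_updated, 4))   # 0.3611, the 36.1 percent given above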

图 18:一个表示大富翁游戏中掷骰子规则的贝叶斯网络,它使算法能够计算从其他方格(如 Just Visiting)出发落在特定一组方格(如黄色地产组)上的概率。(为简单起见,网络忽略了落在机会或社区宝箱方格上并被转移到其他位置的可能性。)D1 和 D2 表示两个骰子的初次投掷,它们相互独立(两者之间没有连线)。如果掷出对子(Doubles12),玩家再掷一次,因此 D3 和 D4 取非零值,依此类推。在文中所述的情形下,只要三个累计总数中的任何一个为 16、17 或 19,玩家就落在黄色地产组上。

FIGURE 18: A Bayesian network that represents the rules for rolling dice in Monopoly and enables an algorithm to calculate the probability of landing on a particular set of squares (such as the yellow set) starting from some other square (such as Just Visiting). (For simplicity, the network omits the possibility of landing on a Chance or Community Chest square and being diverted to a different location.) D1 and D2 represent the initial roll of two dice and they are independent (no link between them). If doubles are rolled (Doubles12), then the player rolls again, so D3 and D4 have non-zero values, and so on. In the situation described, the player lands on the yellow set if any of the three totals is 16, 17, or 19.

贝叶斯网络提供了一种构建知识型系统的方法,可以避免 20 世纪 80 年代困扰基于规则的专家系统的失败。(事实上,如果人工智能界在 20 世纪 80 年代初不那么抵制概率,它可能会避免基于规则的专家系统泡沫之后的人工智能寒冬。)数千种应用已经投入使用,涉及从医疗诊断到恐怖主义预防等各个领域。3

Bayesian networks provide a way to build knowledge-based systems that avoids the failures that plagued the rule-based expert systems of the 1980s. (Indeed, had the AI community been less resistant to probability in the early 1980s, it might have avoided the AI winter that followed the rule-based expert system bubble.) Thousands of applications have been fielded, in areas ranging from medical diagnosis to terrorism prevention.3

贝叶斯网络提供了表示必要概率和执行计算的机制,以实现许多复杂任务的贝叶斯更新。然而,与命题逻辑一样,它们在表示一般知识方面的能力非常有限。在许多应用中,贝叶斯网络表示变得非常庞大且重复——例如,就像围棋规则必须在命题逻辑中对每个方格重复一样,大富翁的基于概率的规则必须对每个玩家、玩家可能所在的每个位置以及游戏中的每步重复。如此庞大的网络几乎不可能手工创建;相反,人们必须求助于用 C++ 等传统语言编写的代码来生成和拼凑多个贝叶斯网络片段。虽然这作为特定问题的工程解决方案很实用,但它阻碍了通用性,因为 C++ 代码必须由人类专家为每个应用程序重新编写。

Bayesian networks provide machinery for representing the necessary probabilities and performing the calculations to implement Bayesian updating for many complex tasks. Like propositional logic, however, they are quite limited in their ability to represent general knowledge. In many applications, the Bayesian network representation becomes very large and repetitive—for example, just as the rules of Go have to be repeated for every square in propositional logic, the probability-based rules of Monopoly have to be repeated for every player, for every location a player might be on, and for every move in the game. Such huge networks are virtually impossible to create by hand; instead, one would have to resort to code written in a traditional language such as C++ to generate and piece together multiple Bayes net fragments. While this is practical as an engineering solution for a specific problem, it is an obstacle to generality because the C++ code has to be written anew by a human expert for each application.

一阶概率语言

First-order probabilistic languages

幸运的是,事实证明我们可以将一阶逻辑的表达能力与贝叶斯网络简洁刻画概率信息的能力结合起来。这种结合给了我们两全其美的结果:基于概率的知识型系统能够处理比逻辑方法或贝叶斯网络都广泛得多的现实情况。例如,我们可以轻松地刻画有关遗传的概率知识:

It turns out, fortunately, that we can combine the expressiveness of first-order logic with the ability of Bayesian networks to capture probabilistic information concisely. This combination gives us the best of both worlds: probabilistic knowledge-based systems are able to handle a much wider range of real-world situations than either logical methods or Bayesian networks. For example, we can easily capture probabilistic knowledge about genetic inheritance:

对于所有人 c、f 和 m,

如果 f 是 c 的父亲,且 m 是 c 的母亲,

并且 f 和 m 的血型都是 AB,

那么 c 的血型为 AB 的概率为 0.5。

for all persons c, f, and m,

if f is the father of c and m is the mother of c

and both f and m have blood type AB,

then c has blood type AB with probability 0.5.
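
In a probabilistic-programming style, this one rule becomes a short generative function that applies to any individuals whatsoever. In the minimal Python sketch below, only the both-AB case and its 0.5 figure come from the rule above; the even split of the remaining probability between types A and B, like all the names, is an extra illustrative assumption.

    import random

    def child_blood_type(father_type, mother_type):
        if father_type == "AB" and mother_type == "AB":
            r = random.random()
            if r < 0.5:
                return "AB"                  # the rule: AB with probability 0.5
            return "A" if r < 0.75 else "B"  # assumed split for the remainder
        raise NotImplementedError("other parent combinations not modeled here")

    # One line of first-order knowledge covers every (c, f, m) triple at once,
    # which is precisely what a fixed Bayesian network cannot express concisely.
    samples = [child_blood_type("AB", "AB") for _ in range(100_000)]
    print(samples.count("AB") / len(samples))   # close to 0.5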

一阶逻辑与概率的结合实际上为我们提供了远不止一种表达大量物体不确定信息的方法。原因是,当我们将不确定性添加到包含物体的世界中时,我们得到了两种新的不确定性:不仅仅是关于哪些事实是真或假的不确定性,还有关于哪些物体存在的不确定性以及关于哪些物体是哪些物体的不确定性。这些不确定性是完全普遍存在的。世界并不像维多利亚时代的戏剧那样带有人物列表;相反,你会从观察中逐渐了解物体的存在。

The combination of first-order logic and probability actually gives us much more than just a way to express uncertain information about lots of objects. The reason is that when we add uncertainty to worlds containing objects, we get two new kinds of uncertainty: not just uncertainty about which facts are true or false but also uncertainty about what objects exist and uncertainty about which objects are which. These kinds of uncertainty are completely pervasive. The world does not come with a list of characters, like a Victorian play; instead, you gradually learn about the existence of objects from observation.

有时,对新物体的认知可以相当明确,比如当你打开酒店窗户,第一次看到圣心大教堂时;也可以相当模糊,比如当你感觉到轻微的隆隆声,那可能是地震,也可能是地铁列车经过。圣心大教堂的身份毫不含糊,但地铁列车的身份却并非如此:你可能乘坐同一列火车数百次,却从未意识到它是同一列。有时我们不需要消除这种不确定性:我通常不会给一袋樱桃番茄里的每颗番茄命名并记录每颗的状况,除非我正在记录一项番茄腐烂实验的进展。另一方面,对于一个满是研究生的班级,我会尽力记住他们各自的身份。(曾经,我的小组里有两名研究助理,名字和姓氏相同,长相非常相似,研究的课题也密切相关;至少,我很确定是两个人。)问题在于,我们直接感知到的不是物体的身份,而是其外观(的某些方面);物体通常没有小小的车牌来唯一地标识它们。身份是我们的头脑有时为了自己的目的而赋予物体的东西。

Sometimes the knowledge of new objects can be fairly definite, as when you open your hotel window and see the basilica of Sacré-Cœur for the first time; or it can be quite indefinite, as when you feel a gentle rumble that might be an earthquake or a passing subway train. And while the identity of Sacré-Cœur is quite unambiguous, the identity of subway trains is not: you might ride the same physical train hundreds of times without ever realizing it’s the same one. Sometimes we don’t need to resolve the uncertainty: I don’t usually name all the tomatoes in a bag of cherry tomatoes and keep track of how well each one is doing, unless perhaps I am recording the progress of a tomato putrefaction experiment. For a class full of graduate students, on the other hand, I try my best to keep track of their identities. (Once, there were two research assistants in my group who had the same first and last names and were of very similar appearance and worked on closely related topics; at least, I am fairly sure there were two.) The problem is that we directly perceive not the identity of objects but (aspects of) their appearance; objects do not usually have little license plates that uniquely identify them. Identity is something our minds sometimes attach to objects for our own purposes.

概率论与富有表现力的形式语言相结合是人工智能的一个相当新的子领域,通常称为概率编程。4目前已经开发了几十种概率编程语言,即 PPL,其中许多语言的表达能力源自普通编程语言,而不是一阶逻辑。所有 PPL 系统都具有表示和推理复杂、不确定知识的能力。应用包括微软的 TrueSkill 系统,该系统每天对数百万视频游戏玩家进行评分;人类认知方面以前无法用任何机械假设解释的模型,例如从单个示例中学习新的物体视觉类别的能力;5以及《全面禁止核试验条约》(CTBT)的全球地震监测,负责检测秘密核爆炸。6

The combination of probability theory with an expressive formal language is a fairly new subfield of AI, often called probabilistic programming.4 Several dozen probabilistic programming languages, or PPLs, have been developed, many of them deriving their expressive power from ordinary programming languages rather than first-order logic. All PPL systems have the capacity to represent and reason with complex, uncertain knowledge. Applications include Microsoft’s TrueSkill system, which rates millions of video game players every day; models for aspects of human cognition that were previously inexplicable by any mechanistic hypothesis, such as the ability to learn new visual categories of objects from single examples;5 and the global seismic monitoring for the Comprehensive Nuclear-Test-Ban Treaty (CTBT), which is responsible for detecting clandestine nuclear explosions.6

CTBT 监测系统从全球 150 多台地震仪组成的网络收集实时地面运动数据,旨在识别地球上发生的所有超过一定震级的地震事件,并标记其中可疑的事件。显然,这个问题中充满了存在性不确定性,因为我们事先并不知道将会发生哪些事件;此外,数据中的绝大多数信号都只是噪音。身份不确定性同样大量存在:在南极洲 A 站检测到的一段地震能量信号,可能来自、也可能不来自与在巴西 B 站检测到的另一段信号相同的事件。聆听地球就像聆听成千上万个同时进行的对话,这些对话被传输延迟和回声打乱,并被汹涌的海浪淹没。

The CTBT monitoring system collects real-time ground movement data from a global network of over 150 seismometers and aims to identify all the seismic events occurring on Earth above a certain magnitude and to flag the suspicious ones. Clearly there is plenty of existence uncertainty in this problem, because we don’t know in advance the events that will occur; moreover, the vast majority of signals in the data are just noise. There is also lots of identity uncertainty: a blip of seismic energy detected at station A in Antarctica may or may not come from the same event as another blip detected at station B in Brazil. Listening to the Earth is like listening to thousands of simultaneous conversations that have been scrambled by transmission delays and echoes and drowned out by crashing waves.

我们如何使用概率编程来解决这个问题?有人可能会认为我们需要一些非常聪明的算法来理清所有的可能性。事实上,通过遵循基于知识的系统的方法,我们根本不需要设计任何新的算法。我们只需使用 PPL 来表达我们对地球物理学的了解:自然地震区发生事件的频率、地震波穿过地球的速度和衰减速度、探测器的灵敏度以及噪音的大小。然后我们添加数据并运行概率推理算法。由此产生的监测系统称为 NET-VISA,自 2018 年以来一直作为条约核查制度的一部分运行。图 19显示了 NET-VISA 对朝鲜 2013 年核试验的检测。

How do we solve this problem using probabilistic programming? One might think we need some very clever algorithms to sort out all the possibilities. In fact, by following the methodology of knowledge-based systems, we don’t have to devise any new algorithms at all. We simply use a PPL to express what we know of geophysics: how often events tend to occur in areas of natural seismicity, how fast seismic waves travel through the Earth and how quickly they decay, how sensitive the detectors are, and how much noise there is. Then we add the data and run a probabilistic reasoning algorithm. The resulting monitoring system, called NET-VISA, has been operating as part of the treaty verification regime since 2018. Figure 19 shows NET-VISA’s detection of a 2013 nuclear test in North Korea.

图 19:朝鲜政府于 2013 年 2 月 12 日进行的核试验的位置估计。隧道入口(中下部黑色十字)是在卫星照片中识别出来的。NET-VISA 的位置估计距离隧道入口约 700 米,主要基于 4,000 至 10,000 公里外台站的探测结果。CTBTO LEB 位置是地球物理学专家的一致估计。

FIGURE 19: Location estimates for the February 12, 2013, nuclear test carried out by the government of North Korea. The tunnel entrance (black cross at lower center) was identified in satellite photographs. The NET-VISA location estimate is approximately 700 meters from the tunnel entrance and is based primarily on detections at stations 4,000 to 10,000 kilometers away. The CTBTO LEB location is the consensus estimate from expert geophysicists.

关注世界

Keeping track of the world

概率推理最重要的作用之一,是追踪世界上无法直接观察的部分。在大多数视频游戏和棋盘游戏中,这并不必要,因为所有相关信息都是可观察的,但在现实世界中,这种情况很少见。

One of the most important roles for probabilistic reasoning is in keeping track of parts of the world that are not directly observable. In most video and board games, this is unnecessary because all the relevant information is observable, but in the real world this is seldom the case.

图 20:(左)事故发生前的情况图。自动驾驶沃尔沃汽车(标记为 V)正在接近十字路口,以每小时 38 英里的速度在最右侧车道上行驶。其他两条车道上的车辆已停止行驶,交通信号灯(L)变为黄色。沃尔沃汽车看不到一辆本田汽车(H)正在左转;(右)事故发生后的情况。

FIGURE 20: (left) Diagram of the situation leading up to the accident. The self-driving Volvo, marked V, is approaching an intersection, driving in the rightmost lane at thirty-eight miles per hour. Traffic in the other two lanes is stopped and the traffic light (L) is turning yellow. Invisible to the Volvo, a Honda (H) is making a left turn; (right) aftermath of the accident.

一个例子是首批涉及自动驾驶汽车的严重事故之一。该事故于 2017 年 3 月 24 日发生在亚利桑那州坦佩市南麦克林托克大道与东唐卡洛斯大道的交汇处。7如图20所示,一辆自动驾驶沃尔沃 (V) 在麦克林托克大道向南行驶,正接近一个交通信号灯刚变黄的十字路口。沃尔沃的车道畅通无阻,因此它以相同的速度通过十字路口。然后,一辆目前看不见的车辆(图 20中的本田 (H))从停下的车流后面出现,随后发生了碰撞。

An example is given by one of the first serious accidents involving a self-driving car. It occurred on South McClintock Drive at East Don Carlos Avenue in Tempe, Arizona, on March 24, 2017.7 As shown in figure 20, a self-driving Volvo (V), going south on McClintock, is approaching an intersection where the traffic light is just turning yellow. The Volvo’s lane is clear, so it proceeds at the same speed through the intersection. Then a currently invisible vehicle—the Honda (H) in figure 20—appears from behind the queue of stopped traffic and a collision ensues.

为了推断那辆看不见的本田可能存在,沃尔沃在接近十字路口时本可以收集线索。特别是,尽管是绿灯,另外两条车道上的车流却停着不动;排在队首的汽车没有缓慢向前驶入路口,而且刹车灯亮着。这并不是存在看不见的左转车的确凿证据,但也不必是;即使概率很小,也足以提示它减速并更谨慎地进入路口。

To infer the possible presence of the invisible Honda, the Volvo could gather clues as it approaches the intersection. In particular, the traffic in the other two lanes is stopped even though the light is green; the cars at the front of the queue are not inching forward into the intersection and have their brake lights on. This is not conclusive evidence of an invisible left turner but it doesn’t need to be; even a small probability is enough to suggest slowing down and entering the intersection more cautiously.

这个故事的寓意是,在部分可观察的环境中运行的智能代理必须根据他们可以看到的线索,尽可能地追踪他们看不到的事物。

The moral of this story is that intelligent agents operating in partially observable environments have to keep track of what they can’t see—to the extent possible—based on clues from what they can see.

这是另一个更贴近现实的例子:你的钥匙在哪里?除非你碰巧在开车时阅读这本书(不建议),否则你现在可能看不到它们。另一方面,你可能知道它们在哪里:在你的口袋里、在你的包里、在床头柜上、在你挂着的外套口袋里,或者在厨房的挂钩上。你知道这一点,因为你把它们放在那里,而且从那以后它们就没动过。这是一个使用知识和推理来跟踪世界状态的简单例子。

Here’s another example closer to home: Where are your keys? Unless you happen to be driving while reading this book—not recommended—you probably cannot see them right now. On the other hand, you probably know where they are: in your pocket, in your bag, on the bedside table, in the pocket of your coat which is hanging up, or maybe on the hook in the kitchen. You know this because you put them there and they haven’t moved since. This is a simple example of using knowledge and reasoning to keep track of the state of the world.

如果没有这种能力,我们就会迷路——通常真的是迷路。例如,当我写这篇文章时,我正看着一间不起眼的酒店房间的白墙。我在哪里?如果我必须依靠我当前的感知输入,我确实会迷路。事实上,我知道我在苏黎世,因为我昨天到达苏黎世,我还没有离开。和人类一样,机器人需要知道它们在哪里,这样它们才能成功地穿过房间、建筑物、街道、森林和沙漠。

Without this capability, we would be lost—often quite literally. For example, as I write this, I am looking at the white wall of a nondescript hotel room. Where am I? If I had to rely on my current perceptual input, I would indeed be lost. In fact, I know that I am in Zürich, because I arrived in Zürich yesterday and I haven’t left. Like humans, robots need to know where they are so that they can navigate successfully through rooms, buildings, streets, forests, and deserts.

在人工智能中,我们使用信念状态这个术语来指代代理对世界状态的当前了解——无论这种了解有多么不完整和不确定。一般来说,信念状态——而不是当前的感知输入——是做出决定的正确基础。保持信念状态的更新是任何智能代理的核心活动。对于信念状态的某些部分,这是自动发生的——例如,我似乎知道我在苏黎世,而不必去想它。对于其他部分,它可以说是按需发生的。例如,当我在长途旅行中途醒来时发现自己身处一个陌生的城市,时差非常严重,我可能不得不有意识地努力重建我在哪里,我应该在哪里做什么,以及为什么——我想这有点像笔记本电脑自己重启。保持追踪并不意味着总是确切地知道世界上一切事物的状态。显然这是不可能的——例如,我不知道谁住在我位于苏黎世的一家不起眼的酒店的其他房间里,更不用说地球上 80 亿人中大多数人的当前位置和活动了。我对太阳系以外宇宙其他地方发生的事情一无所知。我对当前事态的不确定性既巨大又不可避免。

In AI we use the term belief state to refer to an agent’s current knowledge of the state of the world—however incomplete and uncertain it may be. Generally, the belief state—rather than the current perceptual input—is the proper basis for making decisions about what to do. Keeping the belief state up to date is a core activity for any intelligent agent. For some parts of the belief state, this happens automatically—for example, I just seem to know that I’m in Zürich, without having to think about it. For other parts, it happens on demand, so to speak. For example, when I wake up in a new city with severe jet lag, halfway through a long trip, I may have to make a conscious effort to reconstruct where I am, what I am supposed to be doing, and why—a bit like a laptop rebooting itself, I suppose. Keeping track doesn’t mean always knowing exactly the state of everything in the world. Obviously this is impossible—for example, I have no idea who is occupying the other rooms in my nondescript hotel in Zürich, let alone the present locations and activities of most of the eight billion people on Earth. I haven’t the faintest idea what’s happening in the rest of the universe beyond the solar system. My uncertainty about the current state of affairs is both massive and inevitable.

跟踪不确定世界的基本方法是贝叶斯更新。执行此操作的算法通常分为两个步骤:预测步骤,其中代理根据其最近的操作预测世界的当前状态;然后是更新步骤,其中代理接收新的感知输入并相应地更新其信念。为了说明其工作原理,请考虑机器人在确定其位置时面临的问题。图 21(a)说明了一个典型案例:机器人位于房间中间,对其确切位置有些不确定,并且想要穿过门。它命令轮子向门移动 1.5 米;不幸的是,它的轮子已经老旧并且摇晃不定,因此机器人对它最终到达位置的预测非常不确定,如图21(b)所示。如果它现在试图继续移动,它很可能会撞车。幸运的是,它有一个声纳装置来测量到门柱的距离。如图21(c)所示,测量结果表明机器人距离左门柱约 70 厘米,距离右门柱约 85 厘米。最后,机器人通过将 (b) 中的预测与 (c) 中的测量值相结合来更新其信念状态,以获得图 21(d)中的新信念状态。

The basic method for keeping track of an uncertain world is Bayesian updating. Algorithms for doing this usually have two steps: a prediction step, where the agent predicts the current state of the world given its most recent action, and then an update step, where it receives new perceptual input and updates its beliefs accordingly. To illustrate how this works, consider the problem a robot faces in figuring out where it is. Figure 21(a) illustrates a typical case: The robot is in the middle of a room, with some uncertainty about its exact location, and wants to go through the door. It commands its wheels to move 1.5 meters towards the door; unfortunately, its wheels are old and wobbly, so the robot’s prediction about where it ends up is quite uncertain, as shown in figure 21(b). If it tried to keep moving now, it might well crash. Fortunately, it has a sonar device to measure the distance to the doorposts. As figure 21(c) shows, the measurements suggest the robot is about 70 centimeters from the left doorpost and 85 centimeters from the right. Finally, the robot updates its belief state by combining the prediction in (b) with the measurements in (c) to obtain the new belief state in figure 21(d).
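
The two steps can be written down in a few lines. Below is a one-dimensional sketch in Python; the grid, the Gaussian noise models, and the simplification of treating the sonar reading as a direct position measurement are illustrative assumptions rather than the robot's actual sensor model:

    import numpy as np

    cells = np.arange(0.0, 3.0, 0.05)              # positions along a 3-meter line
    belief = np.exp(-(cells - 0.5) ** 2 / 0.02)    # initial belief: near 0.5 m
    belief /= belief.sum()

    def predict(belief, move=1.5, noise=0.3):
        """Prediction step: shift by the commanded motion, blurred by wheel noise."""
        new = np.zeros_like(belief)
        for i, mass in enumerate(belief):
            new += mass * np.exp(-(cells - (cells[i] + move)) ** 2 / (2 * noise ** 2))
        return new / new.sum()

    def update(belief, measured=2.0, noise=0.1):
        """Update step: reweight each position by how well it explains the reading."""
        likelihood = np.exp(-(cells - measured) ** 2 / (2 * noise ** 2))
        posterior = belief * likelihood
        return posterior / posterior.sum()

    belief = update(predict(belief))               # one full predict/update cycle
    print(f"most likely position: {cells[belief.argmax()]:.2f} m")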

跟踪信念状态的算法不仅可用于处理位置不确定性,还可用于处理地图本身的不确定性。这产生了一种称为 SLAM(同步定位和地图构建)的技术。SLAM 是许多 AI 应用的核心组件,从增强现实系统到自动驾驶汽车和行星探测车。

The algorithm for keeping track of the belief state can be applied to handle not just uncertainty about location but also uncertainty about the map itself. This results in a technique called SLAM (simultaneous localization and mapping). SLAM is a core component of many AI applications, ranging from augmented reality systems to self-driving cars and planetary rovers.

图 21:机器人试图穿过门口。 (a) 初始信念状态:机器人对自己的位置有些不确定;它试图向门移动 1.5 米。 (b) 预测步骤:机器人估计它离门更近,但对于它实际移动的方向非常不确定,因为它的电机老旧,轮子摇晃不稳。 (c) 机器人使用质量较差的声纳设备测量到每个门柱的距离;估计距离左门柱 70 厘米,距离右门柱 85 厘米。 (d) 更新步骤:将 (b) 中的预测与 (c) 中的观察相结合,得出新的信念状态。现在,机器人对自己的位置有了很好的了解,需要稍微调整一下路线才能穿过门。

FIGURE 21: A robot trying to move through a doorway. (a) The initial belief state: the robot is somewhat uncertain of its location; it tries to move 1.5 meters towards the door. (b) The prediction step: the robot estimates that it is closer to the door but is quite uncertain about the direction it actually moved because its motors are old and its wheels wobbly. (c) The robot measures the distance to each doorpost using a poor-quality sonar device; the estimates are 70 centimeters from the left doorpost and 85 centimeters from the right. (d) The update step: combining the prediction in (b) with the observation in (c) gives the new belief state. Now the robot has a pretty good idea of where it is and will need to correct its course a bit to get through the door.

附录 D

Appendix D

从经验中学习

LEARNING FROM EXPERIENCE

学习意味着根据经验提高表现。对于视觉感知系统来说,这可能意味着根据看到这些类别的示例来学习识别更多类别的物体;对于基于知识的系统,简单地获得更多知识也是一种学习,因为这意味着系统可以回答更多问题;对于像 AlphaGo 这样的前瞻性决策系统,学习可能意味着提高其评估位置的能力或提高其探索可能性树有用部分的能力。

Learning means improving performance based on experience. For a visual perception system, that might mean learning to recognize more categories of objects based on seeing examples of those categories; for a knowledge-based system, simply acquiring more knowledge is a form of learning, because it means the system can answer more questions; for a lookahead decision-making system such as AlphaGo, learning could mean improving its ability to evaluate positions or improving its ability to explore useful parts of the tree of possibilities.

从例子中学习

Learning from examples

最常见的机器学习形式称为监督学习。监督学习算法会获得一组训练示例,每个示例都标有正确的输出,并且必须对正确规则提出假设。通常,监督学习系统会寻求优化假设与训练示例之间的一致性。通常,对于比必要更复杂的假设,也会有惩罚——正如奥卡姆剃刀所建议的那样。

The most common form of machine learning is called supervised learning. A supervised learning algorithm is given a collection of training examples, each labeled with the correct output, and must produce a hypothesis as to what the correct rule is. Typically, a supervised learning system seeks to optimize the agreement between the hypothesis and the training examples. Often there is also a penalty for hypotheses that are more complicated than necessary—as recommended by Ockham’s razor.

图 22:围棋中合法和非法的着法:着法 A、B 和 C 对黑棋合法,而着法 D、E 和 F 则不合法。着法 G 可能合法也可能不合法,这取决于对局之前发生的情况。

FIGURE 22: Legal and illegal moves in Go: moves A, B, and C are legal for Black, while moves D, E, and F are illegal. Move G might or might not be legal, depending on what has happened previously in the game.

让我们以学习围棋中合法着法的问题为例来说明这一点。(如果你已经知道围棋规则,那么至少这部分很容易理解;如果不知道,那你将更能体会学习程序的处境。)假设算法从如下假设开始:

Let’s illustrate this for the problem of learning the legal moves in Go. (If you already know the rules of Go, then at least this will be easy to follow; if not, then you’ll be better able to sympathize with the learning program.) Suppose the algorithm starts with the hypothesis

对于所有时间步长 t 和所有位置 l,

在时间 t 在位置 l 处落子是合法的。

for all time steps t, and for all locations l,

it is legal to play a stone at location l at time t.

轮到黑棋在图 22所示的位置移动。算法尝试 A:没问题。B 和 C 也一样。然后它尝试 D,在现有的白棋之上:这是非法的。(在国际象棋或西洋双陆棋中,这样做是没问题的——棋子就是这样被吃掉的。)在 E 处移动,在黑棋之上,也是非法的。(在国际象棋中也是非法的,但在西洋双陆棋中是合法的。)现在,从这五个训练示例中,算法可能会提出以下假设:

It is Black’s turn to move in the position shown in figure 22. The algorithm tries A: that’s fine. B and C too. Then it tries D, on top of an existing white piece: that’s illegal. (In chess or backgammon, it would be fine—that’s how pieces are captured.) The move at E, on top of a black piece, is also illegal. (Illegal in chess too, but legal in backgammon.) Now, from these five training examples, the algorithm might propose the following hypothesis:

对于所有时间步长 t 和所有位置 l,

如果 l 在时间 t 未被占用,

那么在时间 t 在位置 l 处落子就是合法的。

for all time steps t, and for all locations l,

if l is unoccupied at time t,

then it is legal to play a stone at location l at time t.

然后它尝试 F,并惊讶地发现 F 是非法的。经过几次失败的尝试后,它决定采用以下方法:

Then it tries F and finds to its surprise that F is illegal. After a few false starts, it settles on the following:

对于所有时间步长 t 和所有位置 l,

如果 l 在时间 t 未被占用,并且

l 没有被对手的棋子包围,

那么在时间 t 在位置 l 处落子就是合法的。

for all time steps t, and for all locations l,

if l is unoccupied at time t and

l is not surrounded by opponent stones,

then it is legal to play a stone at location l at time t.

(这有时被称为不自杀规则。)最后,它尝试 G,在这种情况下,G 被证明是合法的。在挠头了一会儿并尝试了更多实验之后,它决定假设 G 是可以的,即使它被包围了,因为它在 D 处捕获了白子,因此立即解除了包围。

(This is sometimes called the no suicide rule.) Finally, it tries G, which in this case turns out to be legal. After scratching its head for a while and perhaps trying a few more experiments, it settles on the hypothesis that G is OK, even though it is surrounded, because it captures the white stone at D and therefore becomes un-surrounded immediately.

从规则的逐步发展可以看出,学习是通过对假设进行一系列修改来适应观察到的例子。这是学习算法可以轻松做到的事情。机器学习研究人员设计了各种巧妙的算法来快速找到好的假设。在这里,算法在表示围棋规则的逻辑表达式空间中搜索,但假设也可以是代表物理定律的代数表达式、代表疾病和症状的概率贝叶斯网络,甚至是代表其他机器复杂行为的计算机程序。

As you can see from the gradual progression of rules, learning takes place by a sequence of modifications to the hypothesis so as to fit the observed examples. This is something a learning algorithm can do easily. Machine learning researchers have designed all sorts of ingenious algorithms for finding good hypotheses quickly. Here the algorithm is searching in the space of logical expressions representing Go rules, but the hypotheses could also be algebraic expressions representing physical laws, probabilistic Bayesian networks representing diseases and symptoms, or even computer programs representing the complicated behavior of some other machine.
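
The succession of hypotheses described above is easy to write down explicitly. In the following Python sketch (the board encoding and every name are my own illustrative assumptions, and the predicates deliberately ignore captures and the ko rule), learning simply moves on to a more complex hypothesis whenever a labeled example contradicts the current one:

    def neighbors(loc, size=19):
        r, c = loc
        return [(r + dr, c + dc)
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
                if 0 <= r + dr < size and 0 <= c + dc < size]

    def h1(board, loc):   # initial hypothesis: any move is legal
        return True

    def h2(board, loc):   # after examples D and E: legal only on an empty point
        return loc not in board

    def h3(board, loc):   # after example F: the "no suicide" refinement
        return loc not in board and not all(
            board.get(n) == "W" for n in neighbors(loc))

    def learn(hypotheses, examples):
        """Return the first hypothesis consistent with all labeled examples."""
        for h in hypotheses:
            if all(h(board, loc) == label for board, loc, label in examples):
                return h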

第二个要点是,即使是好的假设也可能是错误的:事实上,上面给出的假设就是错误的,即使在修正以确保 G 合法之后也是如此。它还需要包括劫(即不重复)规则——例如,如果白棋刚刚通过在 D 处落子吃掉了 G 处的一颗黑子,黑棋就不能立即在 G 处落子回提,因为那会再次产生相同的局面。请注意,这条规则与程序迄今所学的内容大相径庭,因为它意味着合法性无法仅由当前局面决定;相反,还必须记住先前的局面。

A second important point is that even good hypotheses can be wrong: in fact, the hypothesis given above is wrong, even after fixing it to ensure that G is legal. It needs to include the ko or no-repetition rule—for example, if White had just captured a black stone at G by playing at D, Black may not recapture by playing at G, since that produces the same position again. Notice that this rule is a radical departure from what the program has learned so far, because it means that legality cannot be determined from the current position; instead, one also has to remember previous positions.

苏格兰哲学家大卫·休谟于 1748 年指出,归纳推理——即从特定观察推理到一般原则——永远无法保证。1现代统计学习理论中,我们要求的不是完全正确的保证,而只是保证所发现的假设可能大致正确。2学习算法可能会“不幸”看到不具代表性的样本——例如,它可能永远不会尝试像 G 这样的举动,认为这是非法的。它也可能无法预测一些奇怪的边缘情况,例如一些更复杂且很少调用的无重复规则形式所涵盖的情况。3但是,只要宇宙表现出一定程度的规律性,算法就不太可能产生严重错误的假设,因为这样的假设很可能会被其中一个实验“发现”。

The Scottish philosopher David Hume pointed out in 1748 that inductive reasoning—that is, reasoning from particular observations to general principles—can never be guaranteed.1 In the modern theory of statistical learning, we ask not for guarantees of perfect correctness but only for a guarantee that the hypothesis found is probably approximately correct.2 A learning algorithm can be “unlucky” and see an unrepresentative sample—for example, it might never try a move like G, thinking it to be illegal. It can also fail to predict some weird edge cases, such as the ones covered by some of the more complicated and rarely invoked forms of the no-repetition rule.3 But, as long as the universe exhibits some degree of regularity, it’s very unlikely that the algorithm could produce a seriously bad hypothesis, because such a hypothesis would very probably have been “found out” by one of the experiments.

深度学习——这项在媒体上引起人们对人工智能大肆宣传的技术——主要是一种监督学习。它代表了人工智能近几十年来最重大的进步之一,因此值得了解它的工作原理。此外,一些研究人员认为,它将在几年内催生出人类级别的人工智能系统,因此最好评估一下这是否可能成真。

Deep learning—the technology causing all the hullabaloo about AI in the media—is primarily a form of supervised learning. It represents one of the most significant advances in AI in recent decades, so it’s worth understanding how it works. Moreover, some researchers believe it will lead to human-level AI systems within a few years, so it’s a good idea to assess whether that’s likely to be true.

在特定任务的背景下理解深度学习是最容易的,例如学习区分长颈鹿和美洲驼。给定每一类的一些带标签的照片,学习算法必须形成一个能够对未标记图像进行分类的假设。从计算机的角度来看,图像只不过是一张巨大的数字表,每个数字对应图像中一个像素的三个 RGB 值之一。因此,我们需要的不是以棋盘局面和一步棋为输入并判断这步棋是否合法的围棋假设,而是一个以数字表为输入并预测类别(长颈鹿或美洲驼)的长颈鹿-美洲驼假设。

It’s easiest to understand deep learning in the context of a particular task, such as learning to distinguish giraffes and llamas. Given some labeled photographs of each, the learning algorithm has to form a hypothesis that allows it to classify unlabeled images. An image is, from the computer’s point of view, nothing but a large table of numbers, with each number corresponding to one of three RGB values for one pixel of the image. So, instead of a Go hypothesis that takes a board position and a move as input and decides whether the move is legal, we need a giraffe–llama hypothesis that takes a table of numbers as input and predicts a category (giraffe or llama).

现在的问题是,什么样的假设?在过去五十多年的计算机视觉研究中,人们尝试了许多方法。目前最受欢迎的是深度卷积网络。让我来逐词解释:它之所以被称为网络,是因为它表示一个复杂的数学表达式,由许多较小的子表达式以规则的方式组合而成,而这种组合结构具有网络的形式。(这类网络常被称为神经网络,因为其设计者从大脑中的神经元网络汲取灵感。)它被称为卷积,是因为这是一种花哨的数学说法,表示网络结构在整个输入图像上以固定模式重复自身。它被称为深度,是因为这种网络通常有很多层,也因为这个词听起来令人印象深刻且略带神秘。

Now the question is, what sort of hypothesis? Over the last fifty-odd years of computer vision research, many approaches have been tried. The current favorite is a deep convolutional network. Let me unpack this: It’s called a network because it represents a complex mathematical expression composed in a regular way from many smaller subexpressions, and the compositional structure has the form of a network. (Such networks are often called neural networks because their designers draw inspiration from the networks of neurons in the brain.) It’s called convolutional because that’s a fancy mathematical way to say that the network structure repeats itself in a fixed pattern across the whole input image. And it’s called deep because such networks typically have many layers, and also because it sounds impressive and slightly spooky.

图 23:(左)用于识别图像中物体的深度卷积网络的简化描绘。图像像素值从左侧输入,网络在最右侧的两个节点输出值,表示图像是美洲驼或长颈鹿的可能性。请注意第一层中暗线所示的局部连接模式如何在整个层中重复;(右)网络中的一个节点。每个传入值都有一个可调权重,使节点对它给予更多或更少的关注。然后,传入信号的总和经过一个门控函数,该函数允许大信号通过但抑制小信号。

FIGURE 23: (left) A simplified depiction of a deep convolutional network for recognizing objects in images. The image pixel values are fed in at the left and the network outputs values at the two rightmost nodes, indicating how likely the image is to be a llama or a giraffe. Notice how the pattern of local connections, indicated by the dark lines in the first layer, repeats across the whole layer; (right) one of the nodes in the network. There is an adjustable weight on each incoming value so that the node pays more or less attention to it. Then the total incoming signal goes through a gating function that allows large signals through but suppresses small ones.

图 23显示了简化的示例(之所以简化,是因为实际网络可能有数百层和数百万个节点)。网络实际上是一个复杂、可调整的数学表达式的图。网络中的每个节点都对应一个简单的可调整表达式,如图所示。通过更改每个输入的权重来进行调整,如“音量控制”所示。然后,输入的加权和在到达节点的输出端之前通过门控函数;通常,门控函数会抑制较小的值并允许较大的值通过。

A simplified example (simplified because real networks may have hundreds of layers and millions of nodes) is shown in figure 23. The network is really a picture of a complex, adjustable mathematical expression. Each node in the network corresponds to a simple adjustable expression, as illustrated in the figure. Adjustments are made by changing the weights on each input, as indicated by the “volume controls.” The weighted sum of the inputs is then passed through a gating function before reaching the output side of the node; typically, the gating function suppresses small values and allows larger ones through.
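
As a sketch in Python (assuming the now-common ReLU as the gating function, which the figure does not specify), a single node is nothing more than a weighted sum followed by the gate:

    import numpy as np

    def node(inputs, weights):
        s = np.dot(weights, inputs)    # the "volume controls": a weighted sum
        return np.maximum(0.0, s)      # the gate: suppress small (negative) signals

    print(node(np.array([0.2, -0.7, 1.5]), np.array([0.5, 0.1, 0.3])))   # 0.48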

网络中的学习只需调整所有音量控制旋钮即可减少标记示例的预测误差。就是这么简单:没有魔法,没有特别巧妙的算法。找出如何转动旋钮以减少误差是微积分的直接应用,用于计算改变每个权重将如何改变输出层的误差。这导致了一个简单的公式,用于将误差从输出层反向传播到输入层,沿途调整旋钮。

Learning takes place in the network simply by adjusting all the volume control knobs to reduce the prediction error on the labeled examples. It’s as simple as that: no magic, no especially ingenious algorithms. Working out which way to turn the knobs to decrease the error is a straightforward application of calculus to compute how changing each weight would change the error at the output layer. This leads to a simple formula for propagating the error backwards from the output layer to the input layer, tweaking knobs along the way.
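
For the simplest possible network (one knob, one input), the knob-turning looks like this. Everything in the sketch is an illustrative assumption, but the gradient formula in the comment is exactly the application of calculus the text refers to:

    data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]   # toy examples: output = 2 * input
    w = 0.0                                        # a single "volume control"
    for _ in range(100):                           # repeated passes over the data
        for x, y in data:
            error = w * x - y                      # prediction minus label
            w -= 0.05 * 2 * error * x              # d(error^2)/dw = 2 * error * x
    print(round(w, 3))                             # settles at 2.0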

奇迹般地,这个过程奏效了。在识别照片中物体这项任务上,深度学习算法表现出了非凡的性能。这一点的最初迹象出现在 2012 年的 ImageNet 竞赛中,该竞赛提供由 1000 个类别、120 万张带标签图像组成的训练数据,然后要求算法标记十万张新图像。4 杰夫·辛顿(Geoff Hinton)是 20 世纪 80 年代第一次神经网络革命的先锋,他当时一直在试验一个非常大的深度卷积网络:65 万个节点和 6000 万个参数。他和多伦多大学的团队将 ImageNet 的错误率降至 15%,比之前 26% 的最佳水平有了显著提高。5 到 2015 年,数十个团队都在使用深度学习方法,错误率降到了 5%,与一个花数周时间学习识别测试中一千个类别的人相当。6 2017 年,机器的错误率达到 2%。

Miraculously, the process works. For the task of recognizing objects in photographs, deep learning algorithms have demonstrated remarkable performance. The first inkling of this came in the 2012 ImageNet competition, which provides training data consisting of 1.2 million labeled images in one thousand categories, and then requires the algorithm to label one hundred thousand new images.4 Geoff Hinton, a British computational psychologist who was at the forefront of the first neural network revolution in the 1980s, had been experimenting with a very large deep convolutional network: 650,000 nodes and 60 million parameters. He and his group at the University of Toronto achieved an ImageNet error rate of 15 percent, a dramatic improvement on the previous best of 26 percent.5 By 2015, dozens of teams were using deep learning methods and the error rate was down to 5 percent, comparable to that of a human who had spent weeks learning to recognize the thousand categories in the test.6 By 2017, the machine error rate was 2 percent.

大致在同一时期,基于类似方法的语音识别和机器翻译也取得了类似的进步。总的来说,这是人工智能最重要的三个应用领域。深度学习在强化学习的应用中也发挥了重要作用——例如,学习 AlphaGo 用来估计未来可能位置的可取性的评估函数,并学习复杂机器人行为的控制器。

Over roughly the same period, there have been comparable improvements in speech recognition and machine translation based on similar methods. Taken together, these are three of the most important application areas for AI. Deep learning has also played an important role in applications of reinforcement learning—for example, in learning the evaluation function that AlphaGo uses to estimate the desirability of possible future positions, and in learning controllers for complex robotic behaviors.

到目前为止,我们对深度学习为何如此有效仍知之甚少。最好的解释也许是:深度网络确实很深。因为它们有很多层,每一层都可以学习从输入到输出的相当简单的变换,而许多这样的简单变换叠加起来,就构成了从照片到类别标签所需的复杂变换。此外,用于视觉的深度网络具有内置结构,强制实现平移不变性和尺度不变性,也就是说,无论狗出现在图像中的什么位置,无论它在图像中显得多大,它都是狗。

As yet, we have very little understanding as to why deep learning works as well as it does. Possibly the best explanation is that deep networks are deep: because they have many layers, each layer can learn a fairly simple transformation from its inputs to its outputs, while many such simple transformations add up to the complex transformation required to go from a photograph to a category label. In addition, deep networks for vision have built-in structure that enforces translation invariance and scale invariance—meaning that a dog is a dog no matter where it appears in the image and no matter how big it appears in the image.

深度网络的另一个重要特性是,它们似乎常常能发现捕捉图像基本特征的内部表示,例如眼睛、条纹和简单形状。这些特征都不是内置的。我们之所以知道它们存在,是因为我们可以对训练好的网络做实验,看看哪些类型的数据会使内部节点(通常是靠近输出层的节点)被激活。事实上,还可以用另一种方式运行学习算法,让它调整图像本身,从而在选定的内部节点产生更强的响应。将这个过程重复多次,就会产生如今被称为深度梦境或初始主义的图像,如图 24 所示。7 初始主义本身已经成为一种艺术形式,它产生的图像与任何人类艺术都不一样。

Another important property of deep networks is that they often seem to discover internal representations that capture elementary features of images, such as eyes, stripes, and simple shapes. None of these features are built in. We know they are there because we can experiment with the trained network and see what kinds of data cause the internal nodes (typically those close to the output layer) to light up. In fact, it is possible to run the learning algorithm a different way so that it adjusts the image itself to produce a stronger response at chosen internal nodes. Repeating this process many times produces what are now known as deep dreaming or inceptionism images, such as the one in figure 24.7 Inceptionism has become an art form in itself, producing images unlike any human art.

尽管取得了显著成就,但我们目前所理解的深度学习系统还远远不能为一般智能系统提供基础。它们的主要弱点在于它们是电路;它们是命题逻辑和贝叶斯网络的近亲,尽管它们具有许多奇妙的特性,但也缺乏以简洁的方式表达复杂知识形式的能力。这意味着以“本机模式”运行的深度网络需要大量电路来表示相当简单的一般知识。这又意味着要学习大量的权重,因此需要大量的例子——比宇宙所能提供的还要多。

For all their remarkable achievements, deep learning systems as we currently understand them are far from providing a basis for generally intelligent systems. Their principal weakness is that they are circuits; they are cousins of propositional logic and Bayesian networks, which, for all their wonderful properties, also lack the ability to express complex forms of knowledge in a concise way. This means that deep networks operating in “native mode” require vast amounts of circuitry to represent fairly simple kinds of general knowledge. That, in turn, implies vast numbers of weights to learn and hence a need for unreasonable numbers of examples—more than the universe could ever supply.

图 24:Google DeepDream 软件生成的图像。

FIGURE 24: An image generated by Google’s DeepDream software.

有人认为,大脑也是由电路构成的,神经元就是电路元件;因此,电路可以支持人类水平的智能。这是真的,但只是在与大脑由原子构成的相同意义上:原子确实可以支持人类水平的智能,但这并不意味着仅仅将大量原子收集在一起就能产生智能。原子必须以某种方式排列。同样,电路也必须以某种方式排列。计算机也是由电路构成的,既包括其内存中,也包括其处理单元中;但这些电路必须以某种方式排列,并且必须添加多层软件,计算机才能支持高级编程语言和逻辑推理系统的运行。然而,目前没有迹象表明深度学习系统可以自行开发这种能力——要求它们这样做也没有科学意义。

Some argue that the brain is also made of circuits, with neurons as the circuit elements; therefore, circuits can support human-level intelligence. This is true, but only in the same sense that brains are made of atoms: atoms can indeed support human-level intelligence, but that doesn’t mean that just collecting together lots of atoms will produce intelligence. The atoms have to be arranged in certain ways. By the same token, the circuits have to be arranged in certain ways. Computers are also made of circuits, both in their memories and in their processing units; but those circuits have to be arranged in certain ways, and layers of software have to be added, before the computer can support the operation of high-level programming languages and logical reasoning systems. At present, however, there is no sign that deep learning systems can develop such capabilities by themselves—nor does it make scientific sense to require them to do so.

还有更多理由认为,深度学习可能会停滞在远低于通用智能的水平,但我在这里的目的不是诊断所有问题:深度学习社区内部和外部的其他人已经指出了其中的许多问题。关键在于,仅仅创建更大更深的网络、更大的数据集和更大的机器,并不足以创造出人类水平的人工智能。我们已经(在附录 B 中)看到 DeepMind 首席执行官 Demis Hassabis 的观点,即“高级思维和符号推理”对人工智能至关重要。另一位著名的深度学习专家 François Chollet 这样说:10“还有更多的应用完全超出了当前深度学习技术的能力范围——即使有大量人工标注的数据……我们需要摆脱简单的输入到输出映射,转向推理和抽象。”

There are further reasons to think that deep learning may reach a plateau well short of general intelligence, but it’s not my purpose here to diagnose all the problems: others, both inside8 and outside9 the deep learning community, have noted many of them. The point is that simply creating larger and deeper networks and larger data sets and bigger machines is not enough to create human-level AI. We have already seen (in Appendix B) DeepMind CEO Demis Hassabis’s view that “higher-level thinking and symbolic reasoning” are essential for AI. Another prominent deep learning expert, François Chollet, put it this way:10 “Many more applications are completely out of reach for current deep learning techniques—even given vast amounts of human-annotated data. . . . We need to move away from straightforward input-to-output mappings, and on to reasoning and abstraction.”

从思考中学习

Learning from thinking

每当你发现自己不得不思考某件事时,那是因为你还不知道答案。当有人问你新买的手机号码时,你可能不知道。你会想,“好吧,我不知道;那我怎么找到它?”你不是手机的奴隶,所以你不知道如何找到它。你会想,“我怎么才能找到它?”你对此有一个通用的答案:“可能他们把它放在用户容易找到的地方。”(当然,你可能错了。)明显的地方是主屏幕的顶部(不是那里)、电话应用程序内或该应用程序的设置中。你尝试设置>电话,它就在那里。

Whenever you find yourself having to think about something, it’s because you don’t already know the answer. When someone asks for the number of your brand-new cell phone, you probably don’t know it. You think to yourself, “OK, I don’t know it; so how do I find it?” Not being a slave to the cell phone, you don’t know how to find it. You think to yourself, “How do I figure out how to find it?” You have a generic answer to this: “Probably they put it somewhere that’s easy for users to find.” (Of course, you could be wrong about this.) Obvious places would be at the top of the home screen (not there), inside the Phone app, or in Settings for that app. You try Settings>Phone, and there it is.

下次有人问你的号码时,你要么已经知道,要么确切知道如何找到它。你记住了这个过程,不仅适用于这一次、这部手机,而且适用于所有场合下的所有类似手机——也就是说,你存储并重复使用了这个问题的一般化解法。这种一般化是合理的,因为你明白这部特定手机和这个特定场合的具体细节无关紧要。如果这种方法只在星期二对以 17 结尾的电话号码有效,你会感到震惊。

The next time you are asked for your number, you either know it or you know exactly how to get it. You remember the procedure, not just for this phone on this occasion but for all similar phones on all occasions—that is, you store and reuse a generalized solution to the problem. The generalization is justified because you understand that the specifics of this particular phone and this particular occasion are irrelevant. You would be shocked if the method worked only on Tuesdays for phone numbers ending in 17.

图 25 :围棋中的梯子概念。(a)黑棋威胁要吃掉白棋的棋子。(b)白棋试图逃跑。(c)黑棋挡住了逃跑的方向。(d)白棋试图向另一个方向逃跑。(e)按照数字指示的顺序继续下棋。梯子最终到达棋盘边缘,白棋无处可逃。第 7 步是致命一击:白棋组被完全包围并被击败。

FIGURE 25: The concept of a ladder in Go. (a) Black threatens to capture White’s piece. (b) White tries to escape. (c) Black blocks that direction of escape. (d) White tries the other direction. (e) Play continues in the sequence indicated by the numbers. The ladder eventually reaches the edge of the board, where White has nowhere to run. The coup de grâce is administered by move 7: White’s group is completely surrounded and dies.

围棋提供了同类学习的一个绝佳范例。在图 25 (a) 中,我们看到一种常见的情况,黑棋威胁要通过包围白棋来吃掉白棋的棋子。白棋试图通过添加与原棋子相连的棋子来逃脱,但黑棋继续切断逃脱路线。这种走法形成了一个对角线穿过棋盘的梯子,直到它碰到边缘;那时白棋无路可走。如果你是白棋,你可能不会再犯同样的错误:你会意识到,无论你下白棋还是黑棋,梯子模式最终都会导致被吃掉,无论初始位置和方向如何,在游戏的任何阶段。唯一的例外是梯子碰到一些属于逃脱者的额外棋子。梯子模式的普遍性直接遵循了围棋规则。

Go offers a beautiful example of the same kind of learning. In figure 25(a), we see a common situation where Black threatens to capture White’s stone by surrounding it. White attempts to escape by adding stones connected to the original one, but Black continues to cut off the routes of escape. This pattern of moves forms a ladder of stones diagonally across the board, until it runs into the edge; then White has nowhere to go. If you are White, you probably won’t make the same mistake again: you realize that the ladder pattern always results in eventual capture, for any initial location and any direction, at any stage of the game, whether you are playing White or Black. The only exception occurs when the ladder runs into some additional stones belonging to the escapee. The generality of the ladder pattern follows straightforwardly from the rules of Go.

丢失电话号码的案例和围棋梯子的例子说明了从单个例子中学习有效的一般规则的可能性——与深度学习所需的数百万个例子相差甚远。在人工智能中,这种学习被称为基于解释的学习:看到例子后,代理可以向自己解释为什么会出现这样的结果,并通过查看哪些因素对解释至关重要来提取一般原则。

The case of the missing phone number and the case of the Go ladder illustrate the possibility of learning effective, general rules from a single example—a far cry from the millions of examples needed for deep learning. In AI, this kind of learning is called explanation-based learning: on seeing the example, the agent can explain to itself why it came out that way and can extract the general principle by seeing what factors were essential for the explanation.

严格来说,这个过程本身并不增加新知识——例如,白方本可以直接从围棋规则推导出一般梯子模式的存在及其结果,而无需见过任何例子。11 然而,白方很可能永远不会在没有见过例子的情况下发现梯子概念;因此,我们可以把基于解释的学习理解为一种以一般化方式保存计算结果的强大方法,以避免将来不得不重复同样的推理过程(或因推理过程不完善而犯同样的错误)。

Strictly speaking, the process does not, by itself, add new knowledge—for example, White could have simply derived the existence and outcome of the general ladder pattern from the rules of Go, without ever seeing an example.11 Chances are, however, that White wouldn’t ever discover the ladder concept without seeing an example of it; so, we can understand explanation-based learning as a powerful method for saving the results of computation in a generalized way, so as to avoid having to recapitulate the same reasoning process (or making the same mistake with an imperfect reasoning process) in the future.

认知科学研究强调了这类学习在人类认知中的重要性。它以组块之名构成了艾伦·纽厄尔极具影响力的认知理论的核心支柱。12(纽厄尔是 1956 年达特茅斯研讨会的与会者之一,并与赫伯·西蒙共同获得 1975 年图灵奖。)它解释了人类如何通过练习在认知任务上变得更加熟练:原本需要思考的各种子任务变得自动化了。如果没有它,人类的对话将仅限于一两个词的回答,数学家也仍然会掰着手指头数数。

Research in cognitive science has stressed the importance of this type of learning in human cognition. Under the name of chunking, it forms a central pillar of Allen Newell’s highly influential theory of cognition.12 (Newell was one of the attendees of the 1956 Dartmouth workshop and co-winner of the 1975 Turing Award with Herb Simon.) It explains how humans become more fluent at cognitive tasks with practice, as various subtasks that originally required thinking become automatic. Without it, human conversations would be limited to one- or two-word responses and mathematicians would still be counting on their fingers.

致谢

Acknowledgments

本书的创作得到了许多人的帮助。他们包括我在维京出版社 (Paul Slovak) 和企鹅出版社 (Laura Stickney) 的优秀编辑;我的经纪人 John Brockman,他鼓励我写点东西;Jill Leovy 和 Rob Reid,他们提供了大量有用的反馈;以及早期草稿的其他读者,特别是 Ziyad Marar、Nick Hay、Toby Ord、David Duvenaud、Max Tegmark 和 Grace Cassy。Caroline Jeanmaire 在整理早期读者提出的无数改进建议方面提供了极大的帮助,Martin Fukui 负责收集图像许可。

Many people have helped in the creation of this book. They include my excellent editors at Viking (Paul Slovak) and Penguin (Laura Stickney); my agent, John Brockman, who encouraged me to write something; Jill Leovy and Rob Reid, who provided reams of useful feedback; and other readers of early drafts, especially Ziyad Marar, Nick Hay, Toby Ord, David Duvenaud, Max Tegmark, and Grace Cassy. Caroline Jeanmaire was immensely helpful in collating the innumerable suggestions for improvements made by the early readers, and Martin Fukui handled the collecting of permissions for images.

本书的主要技术思想是与伯克利人类兼容人工智能中心的成员合作开发的,特别是 Tom Griffiths、Anca Dragan、Andrew Critch、Dylan Hadfield-Menell、Rohin Shah 和 Smitha Milli。该中心由执行董事 Mark Nitzberg 和助理主任 Rosie Campbell 出色地领导,并得到了开放慈善基金会的慷慨资助。

The main technical ideas in the book have been developed in collaboration with the members of the Center for Human-Compatible AI at Berkeley, especially Tom Griffiths, Anca Dragan, Andrew Critch, Dylan Hadfield-Menell, Rohin Shah, and Smitha Milli. The Center has been admirably piloted by executive director Mark Nitzberg and assistant director Rosie Campbell, and generously funded by the Open Philanthropy Foundation.

Ramona Alvarez 和 Carine Verdeau 在整个过程中帮助事情顺利运转,而我了不起的妻子洛伊和我们的孩子戈登、露西、乔治和艾萨克,则提供了大量且必不可少的爱、宽容以及促使我完成本书的鼓励——尽管未必总是按这个顺序。

Ramona Alvarez and Carine Verdeau helped to keep things running throughout the process, and my incredible wife, Loy, and our children—Gordon, Lucy, George, and Isaac—supplied copious and necessary amounts of love, forbearance, and encouragement to finish, not always in that order.

笔记

Notes

第一章

CHAPTER 1

1. 我与现任谷歌研究主管 Peter Norvig 合著的人工智能教科书的第一版:Stuart Russell 和 Peter Norvig,《人工智能:一种现代方法》,第 1 版(Prentice Hall,1995 年)。

1. The first edition of my textbook on AI, co-authored with Peter Norvig, currently director of research at Google: Stuart Russell and Peter Norvig, Artificial Intelligence: A Modern Approach, 1st ed. (Prentice Hall, 1995).

2. Robinson 开发了归结算法,只要有足够的时间,该算法就可以证明一组一阶逻辑断言的任何逻辑结果。与以前的算法不同,它不需要转换为命题逻辑。J. Alan Robinson,《基于归结原理的面向机器的逻辑》,ACM 杂志》第 12 期(1965 年):23-41。

2. Robinson developed the resolution algorithm, which can, given enough time, prove any logical consequence of a set of first-order logical assertions. Unlike previous algorithms, it did not require conversion to propositional logic. J. Alan Robinson, “A machine-oriented logic based on the resolution principle,” Journal of the ACM 12 (1965): 23–41.

3. Arthur Samuel 是美国计算机时代的先驱,早期在 IBM 工作。描述他在跳棋方面的工作的论文是第一个使用机器学习这一术语的论文,尽管Alan Turing 早在 1947 年就谈到了“能够从经验中学习的机器”。Arthur Samuel,《使用跳棋游戏进行机器学习的一些研究》, IBM 研究与开发杂志3(1959 年):210-29。

3. Arthur Samuel, an American pioneer of the computer era, did his early work at IBM. The paper describing his work on checkers was the first to use the term machine learning, although Alan Turing had already talked about “a machine that can learn from experience” as early as 1947. Arthur Samuel, “Some studies in machine learning using the game of checkers,” IBM Journal of Research and Development 3 (1959): 210–29.

4.众所周知的“莱特希尔报告”导致除爱丁堡大学和苏塞克斯大学之外的大学停止了对人工智能的研究资助:迈克尔·詹姆斯·莱特希尔,《人工智能:一项总体调查》,载于《人工智能:论文研讨会》 (英国科学研究委员会,1973 年)。

4. The “Lighthill Report,” as it became known, led to the termination of research funding for AI except at the universities of Edinburgh and Sussex: Michael James Lighthill, “Artificial intelligence: A general survey,” in Artificial Intelligence: A Paper Symposium (Science Research Council of Great Britain, 1973).

5. CDC 6600 占满了一整个房间,价值相当于 2000 万美元。在那个时代,它的功能非常强大,尽管比 iPhone 的功能弱了一百万倍

5. The CDC 6600 filled an entire room and cost the equivalent of $20 million. For its era it was incredibly powerful, albeit a million times less powerful than an iPhone.

6.在“深蓝”战胜卡斯帕罗夫之后,至少有一位评论员预测,同样的事情还要等一百年才会在围棋上发生:乔治·约翰逊,《要想测试一台强大的计算机,就得玩一场古老的游戏》,纽约时报》 ,1997 年 7 月 29 日。

6. Following Deep Blue’s victory over Kasparov, at least one commentator predicted that it would take one hundred years before the same thing happened in Go: George Johnson, “To test a powerful computer, play an ancient game,” The New York Times, July 29, 1997.

7.有关核技术发展史的可读性很强的资料,请参阅 Richard Rhodes 的《原子弹的诞生》西蒙与舒斯特出版社,1987 年)。

7. For a highly readable history of the development of nuclear technology, see Richard Rhodes, The Making of the Atomic Bomb (Simon & Schuster, 1987).

8.简单的监督学习算法可能不会产生这种效果,除非它被包装在 A/B 测试框架内(这在在线营销环境中很常见)。如果 Bandit 算法和强化学习算法使用用户状态的显式表示或与用户交互历史的隐式表示,它们将产生这种效果。

8. A simple supervised learning algorithm may not have this effect, unless it is wrapped within an A/B testing framework (as is common in online marketing settings). Bandit algorithms and reinforcement learning algorithms will have this effect if they operate with an explicit representation of user state or an implicit representation in terms of the history of interactions with the user.

9.有人认为,追求利润最大化的公司已经是失控的人工实体。例如,请参阅 Charles Stross 的“老兄,你毁了未来!”(主题演讲,第 34 届混沌通信大会,2017 年)。另请参阅 Ted Chiang 的“硅谷正在变成自己最可怕的恐惧”, Buzzfeed,2017 年 12 月 18 日。Daniel Hillis 的“第一批机器智能”进一步探讨了这一想法,收录于《可能的思维:看待人工智能的 25 种方式》,由 John Brockman 主编(企鹅出版社,2019 年)。

9. Some have argued that profit-maximizing corporations are already out-of-control artificial entities. See, for example, Charles Stross, “Dude, you broke the future!” (keynote, 34th Chaos Communications Congress, 2017). See also Ted Chiang, “Silicon Valley is turning into its own worst fear,” Buzzfeed, December 18, 2017. The idea is explored further by Daniel Hillis, “The first machine intelligences,” in Possible Minds: Twenty-Five Ways of Looking at AI, ed. John Brockman (Penguin Press, 2019).

10.在当时,维纳的论文是一个罕见的例外,打破了“所有技术进步都是好事”的主流观点:诺伯特·维纳,《自动化的一些道德和技术后果》,《科学》 131(1960):1355-58。

10. For its time, Wiener’s paper was a rare exception to the prevailing view that all technological progress was a good thing: Norbert Wiener, “Some moral and technical consequences of automation,” Science 131 (1960): 1355–58.

第二章

CHAPTER 2

1.圣地亚哥·拉蒙·卡哈尔于 1894 年提出突触变化是学习的场所,但直到 20 世纪 60 年代末,这一假设才得到实验证实。参见 Timothy Bliss 和 Terje Lomo 的《刺激穿通通路后,麻醉兔齿状区突触传递的长期增强》,生理学杂志》第 232 期(1973 年):第 331-56 页。

1. Santiago Ramón y Cajal proposed synaptic changes as the site of learning in 1894, but it was not until the late 1960s that this hypothesis was confirmed experimentally. See Timothy Bliss and Terje Lomo, “Long-lasting potentiation of synaptic transmission in the dentate area of the anaesthetized rabbit following stimulation of the perforant path,” Journal of Physiology 232 (1973): 331–56.

2. 有关简介,请参阅 James Gorman 的“了解我们对大脑的了解有多么少”,《纽约时报》,2014 年 11 月 10 日。另请参阅 Tom Siegfried 的“了解大脑还有很长的路要走”,《科学新闻》,2017 年 7 月 25 日。《神经元》杂志 2017 年特刊(第 94 卷,第 933–1040 页)很好地概述了理解大脑的多种不同方法。

2. For a brief introduction, see James Gorman, “Learning how little we know about the brain,” The New York Times, November 10, 2014. See also Tom Siegfried, “There’s a long way to go in understanding the brain,” ScienceNews, July 25, 2017. A special 2017 issue of the journal Neuron (vol. 94, pp. 933–1040) provides a good overview of many different approaches to understanding the brain.

3.意识的存在与否(实际的主观体验)肯定会对我们对机器的道德考量产生影响。如果我们有一天能够获得足够的理解来设计有意识的机器,或者发现我们已经这样做了,我们将面临许多我们基本上没有准备的重要道德问题

3. The presence or absence of consciousness—actual subjective experience—certainly makes a difference in our moral consideration for machines. If ever we gain enough understanding to design conscious machines or to detect that we have done so, we would face many important moral issues for which we are largely unprepared.

4.以下论文是第一批明确强化学习算法与神经生理记录之间联系的论文之一:Wolfram Schultz、Peter Dayan 和 P. Read Montague,《预测和奖励的神经基础》,科学》 275(1997):1593-99。

4. The following paper was among the first to make a clear connection between reinforcement learning algorithms and neurophysiological recordings: Wolfram Schultz, Peter Dayan, and P. Read Montague, “A neural substrate of prediction and reward,” Science 275 (1997): 1593–99.

5.人们希望通过颅内刺激找到治疗各种精神疾病的方法。例如,参见罗伯特·希思的《人类大脑的电自我刺激》,美国精神病学杂志》第 120 卷(1963 年):571-77 页。

5. Studies of intracranial stimulation were carried out with the hope of finding cures for various mental illnesses. See, for example, Robert Heath, “Electrical self-stimulation of the brain in man,” American Journal of Psychiatry 120 (1963): 571–77.

6.一个可能因上瘾而面临自我灭绝的物种的例子:Bryson Voirin,《侏儒树懒 Bradypus pygmaeus 的生物学和保护哺乳动物学杂志》 96(2015 年):703-7。

6. An example of a species that may be facing self-extinction via addiction: Bryson Voirin, “Biology and conservation of the pygmy sloth, Bradypus pygmaeus,” Journal of Mammalogy 96 (2015): 703–7.

7.进化论中的鲍德温效应通常归功于以下论文:詹姆斯·鲍德温,《进化论中的一个新因素》,美国博物学家》第 30 卷(1896 年):441–51 页。

7. The Baldwin effect in evolution is usually attributed to the following paper: James Baldwin, “A new factor in evolution,” American Naturalist 30 (1896): 441–51.

8.鲍德温效应的核心思想也出现在以下著作中:康威·劳埃德·摩根,习惯与本能》(爱德华·阿诺德,1896年)。

8. The core idea of the Baldwin effect also appears in the following work: Conwy Lloyd Morgan, Habit and Instinct (Edward Arnold, 1896).

9.展示鲍德温效应的现代分析与计算机实现:Geoffrey Hinton 和 Steven Nowlan,“学习如何引导进化”,复杂系统1(1987):495–502。

9. A modern analysis and computer implementation demonstrating the Baldwin effect: Geoffrey Hinton and Steven Nowlan, “How learning can guide evolution,” Complex Systems 1 (1987): 495–502.

10.包括内部奖励信号回路进化在内的计算机模型对鲍德温效应的进一步阐释:David Ackley 和 Michael Littman,《学习与进化之间的相互作用》,《人工智能 II》 Christopher Langton 等主编(Addison-Wesley,1991 年)。

10. Further elucidation of the Baldwin effect by a computer model that includes the evolution of the internal reward-signaling circuitry: David Ackley and Michael Littman, “Interactions between learning and evolution,” in Artificial Life II, ed. Christopher Langton et al. (Addison-Wesley, 1991).

11. 这里我所指的是我们今天智力概念的根源,而不是描述古希腊的 nous 概念,后者有多种相关含义。

11. Here I am pointing to the roots of our present-day concept of intelligence, rather than describing the ancient Greek concept of nous, which had a variety of related meanings.

12.这段引文出自亚里士多德《尼各马可伦理学》第三卷3,1112b。

12. The quotation is taken from Aristotle, Nicomachean Ethics, Book III, 3, 1112b.

13.卡尔达诺是欧洲最早考虑负数的数学家之一,他发展了游戏概率的早期数学处理方法。他于 1576 年去世,比他的著作出版早了 87 年:杰罗拉莫·卡尔达诺,游戏之书》(里昂,1663 年)。

13. Cardano, one of the first European mathematicians to consider negative numbers, developed an early mathematical treatment of probability in games. He died in 1576, eighty-seven years before his work appeared in print: Gerolamo Cardano, Liber de ludo aleae (Lyons, 1663).

14. 阿诺德的作品最初是匿名出版的,通常被称为《皇家港口逻辑》:Antoine Arnauld, La logique, ou l’art de penser (Chez Charles Savreux, 1662)。另见 Blaise Pascal, Pensées (Chez Guillaume Desprez, 1670)。

14. Arnauld’s work, initially published anonymously, is often called The Port-Royal Logic: Antoine Arnauld, La logique, ou l’art de penser (Chez Charles Savreux, 1662). See also Blaise Pascal, Pensées (Chez Guillaume Desprez, 1670).

15.效用概念:丹尼尔·伯努利,《新方法论》,《圣彼得堡皇家科学院院刊》第 5 卷(1738 年):175-92 页。伯努利的效用概念源于一位商人塞姆普罗尼乌斯,他正在考虑用一艘船运输贵重货物还是将其分成两艘船,假设每艘船在旅途中沉没的概率为 50%。两种解决方案的预期货币价值相同,但塞姆普罗尼乌斯显然更喜欢两艘船的解决方案。

15. The concept of utility: Daniel Bernoulli, “Specimen theoriae novae de mensura sortis,” Proceedings of the St. Petersburg Imperial Academy of Sciences 5 (1738): 175–92. Bernoulli’s idea of utility arises from considering a merchant, Sempronius, choosing whether to transport a valuable cargo in one ship or to split it between two, assuming that each ship has a 50 percent probability of sinking on the journey. The expected monetary value of the two solutions is the same, but Sempronius clearly prefers the two-ship solution.

16.大多数人认为,冯·诺依曼本人并没有发明这种架构,但他的名字出现在一份描述 EDVAC 存储程序计算机的具有影响力的报告的早期草稿上。

16. By most accounts, von Neumann did not himself invent this architecture but his name was on an early draft of an influential report describing the EDVAC stored-program computer.

17.冯·诺依曼和摩根斯坦的著作在许多方面都是现代经济理论的基础:约翰·冯·诺依曼和奥斯卡·摩根斯坦,博弈论与经济行为》(普林斯顿大学出版社,1944 年)。

17. The work of von Neumann and Morgenstern is in many ways the foundation of modern economic theory: John von Neumann and Oskar Morgenstern, Theory of Games and Economic Behavior (Princeton University Press, 1944).

18. 效用是折现奖励之和这一提议,是保罗·萨缪尔森作为一个数学上方便的假设提出的,见《关于效用测量的注释》,《经济研究评论》第 4 卷(1937 年):第 155–161 页。如果 s_0, s_1, … 是一个状态序列,那么它在该模型中的效用为 U(s_0, s_1, …) = ∑_t γ^t R(s_t),其中 γ 是折现因子,R 是描述状态可取程度的奖励函数。对该模型的简单套用很少与真实个体对当前和未来奖励可取程度的判断一致。如需详尽分析,请参阅 Shane Frederick、George Loewenstein 和 Ted O'Donoghue,《时间折扣和时间偏好:评论性评论》,《经济文献杂志》40(2002 年):351–401。

18. The proposal that utility is a sum of discounted rewards was put forward as a mathematically convenient hypothesis by Paul Samuelson, “A note on measurement of utility,” Review of Economic Studies 4 (1937): 155–61. If s_0, s_1, . . . is a sequence of states, then its utility in this model is U(s_0, s_1, . . .) = ∑_t γ^t R(s_t), where γ is a discount factor and R is a reward function describing the desirability of a state. Naïve application of this model seldom agrees with the judgment of real individuals about the desirability of present and future rewards. For a thorough analysis, see Shane Frederick, George Loewenstein, and Ted O’Donoghue, “Time discounting and time preference: A critical review,” Journal of Economic Literature 40 (2002): 351–401.

19. Maurice Allais, a French economist, proposed a decision scenario in which humans appear consistently to violate the von Neumann–Morgenstern axioms: Maurice Allais, “Le comportement de l’homme rationnel devant le risque: Critique des postulats et axiomes de l’école américaine,” Econometrica 21 (1953): 503–46.

20. For an introduction to non-quantitative decision analysis, see Michael Wellman, “Fundamental concepts of qualitative probabilistic networks,” Artificial Intelligence 44 (1990): 257–303.

21. I will discuss the evidence for human irrationality further in Chapter 9. The standard references include the following: Allais, “Le comportement”; Daniel Ellsberg, Risk, Ambiguity, and Decision (PhD thesis, Harvard University, 1962); Amos Tversky and Daniel Kahneman, “Judgment under uncertainty: Heuristics and biases,” Science 185 (1974): 1124–31.

22. It should be clear that this is a thought experiment that cannot be realized in practice. Choices about different futures are never presented in full detail, and humans never have the luxury of minutely examining and savoring those futures before choosing. Instead, one is given only brief summaries, such as “librarian” or “coal miner.” In making such a choice, one is really being asked to compare two probability distributions over complete futures, one beginning with the choice “librarian” and the other “coal miner,” with each distribution assuming optimal actions on one’s own part within each future. Needless to say, this is not easy.

23. The first mention of a randomized strategy for games appears in Pierre Rémond de Montmort, Essay d’analyse sur les jeux de hazard, 2nd ed. (Chez Jacques Quillau, 1713). The book identifies a certain Monsieur de Waldegrave as the source of an optimal randomized solution for the card game Le Her. Details of Waldegrave’s identity are revealed by David Bellhouse, “The problem of Waldegrave,” Electronic Journal for History of Probability and Statistics 3 (2007).

24. The problem is fully defined by specifying the probability that Alice scores in each of four cases: when she shoots to Bob’s right and he dives right or left, and when she shoots to his left and he dives right or left. In this case, these probabilities are 25 percent, 70 percent, 65 percent, and 10 percent respectively. Now suppose that Alice’s strategy is to shoot to Bob’s right with probability p and his left with probability 1 − p, while Bob dives to his right with probability q and left with probability 1 − q. The payoff to Alice is U_A = 0.25pq + 0.70p(1 − q) + 0.65(1 − p)q + 0.10(1 − p)(1 − q), while Bob’s payoff is U_B = −U_A. At equilibrium, ∂U_A/∂p = 0 and ∂U_B/∂q = 0, giving p = 0.55 and q = 0.60.
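
A quick numerical check of this equilibrium, using the payoff function defined above; at p = 0.55 and q = 0.60, each player is indifferent between their two pure strategies, which is the defining property of a mixed-strategy equilibrium:

```python
def alice_payoff(p, q):
    # Probability that Alice scores, built from the four cases in the note.
    return (0.25 * p * q + 0.70 * p * (1 - q)
            + 0.65 * (1 - p) * q + 0.10 * (1 - p) * (1 - q))

p, q = 0.55, 0.60

# Alice earns the same whether she always shoots right (p = 1) or left (p = 0),
assert abs(alice_payoff(1, q) - alice_payoff(0, q)) < 1e-12
# and Bob, whose payoff is -U_A, is likewise indifferent between his dives.
assert abs(alice_payoff(p, 1) - alice_payoff(p, 0)) < 1e-12

print(alice_payoff(p, q))  # 0.43: Alice scores 43 percent of the time
```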

25. The original game-theoretic problem was introduced by Merrill Flood and Melvin Dresher at the RAND Corporation; Tucker saw the payoff matrix on a visit to their offices and proposed a “story” to go along with it.

26. Game theorists typically say that Alice and Bob could cooperate with each other (refuse to talk) or defect and rat on their accomplice. I find this language confusing, because “cooperate with each other” is not a choice that each agent can make separately, and because in common parlance one often talks about cooperating with the police, receiving a lighter sentence in return for cooperating, and so on.

27. For an interesting trust-based solution to the prisoner’s dilemma and other games, see Joshua Letchford, Vincent Conitzer, and Kamal Jain, “An ‘ethical’ game-theoretic solution concept for two-player perfect-information games,” in Proceedings of the 4th International Workshop on Web and Internet Economics, ed. Christos Papadimitriou and Shuzhong Zhang (Springer, 2008).

28. Origin of the tragedy of the commons: William Forster Lloyd, Two Lectures on the Checks to Population (Oxford University, 1833).

29. Modern revival of the topic in the context of global ecology: Garrett Hardin, “The tragedy of the commons,” Science 162 (1968): 1243–48.

30. It’s quite possible that even if we had tried to build intelligent machines from chemical reactions or biological cells, those assemblages would have turned out to be implementations of Turing machines in nontraditional materials. Whether an object is a general-purpose computer has nothing to do with what it’s made of.

31. Turing’s breakthrough paper defined what is now known as the Turing machine, the basis for modern computer science. The Entscheidungsproblem, or decision problem, in the title is the problem of deciding entailment in first-order logic: Alan Turing, “On computable numbers, with an application to the Entscheidungsproblem,” Proceedings of the London Mathematical Society, 2nd ser., 42 (1936): 230–65.

32. A good survey of research on negative capacitance by one of its inventors: Sayeef Salahuddin, “Review of negative capacitance transistors,” in International Symposium on VLSI Technology, Systems and Application (IEEE Press, 2016).

33. For a much better explanation of quantum computation, see Scott Aaronson, Quantum Computing since Democritus (Cambridge University Press, 2013).

34. The paper that established a clear complexity-theoretic distinction between classical and quantum computation: Ethan Bernstein and Umesh Vazirani, “Quantum complexity theory,” SIAM Journal on Computing 26 (1997): 1411–73.

35. The following article by a renowned physicist provides a good introduction to the current state of understanding and technology: John Preskill, “Quantum computing in the NISQ era and beyond,” arXiv:1801.00862 (2018).

36. On the maximum computational ability of a one-kilogram object: Seth Lloyd, “Ultimate physical limits to computation,” Nature 406 (2000): 1047–54.

37. For an example of the suggestion that humans may be the pinnacle of physically achievable intelligence, see Kevin Kelly, “The myth of a superhuman AI,” Wired, April 25, 2017: “We tend to believe that the limit is way beyond us, way ‘above’ us, as we are ‘above’ an ant. . . . What evidence do we have that the limit is not us?”

38. In case you are wondering about a simple trick to solve the halting problem: the obvious method of just running the program to see if it finishes doesn’t work, because that method doesn’t necessarily finish. You might wait a million years and still not know if the program is really stuck in an infinite loop or just taking its time.

39. The proof that the halting problem is undecidable is an elegant piece of trickery. The question: Is there a LoopChecker(P,X) program that, for any program P and any input X, decides correctly, in finite time, whether P applied to input X will halt and produce a result or keep chugging away forever? Suppose that LoopChecker exists. Now write a program Q that calls LoopChecker as a subroutine, with Q itself and X as inputs, and then does the opposite of what LoopChecker(Q,X) predicts. So, if LoopChecker says that Q halts, Q doesn’t halt, and vice versa. Thus, the assumption that LoopChecker exists leads to a contradiction, so LoopChecker cannot exist.
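
The construction can be written out almost line for line; in this sketch, LoopChecker is the hypothetical decider assumed by the proof (it does not and cannot exist, which is the point of the argument):

```python
# Assume LoopChecker(P, X) returns True iff program P halts on input X.

def Q(X):
    if LoopChecker(Q, X):   # If the decider predicts that Q halts on X,
        while True:         # then Q loops forever instead;
            pass
    else:                   # if it predicts that Q loops forever,
        return "halted"     # then Q halts immediately.

# Either way, LoopChecker(Q, X) is wrong about Q, contradicting the
# assumption that it always answers correctly.
```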

40. I say “appear” because, as yet, the claim that the class of NP-complete problems requires superpolynomial time (usually referred to as P ≠ NP) is still an unproven conjecture. After almost fifty years of research, however, nearly all mathematicians and computer scientists are convinced the claim is true.

41. Lovelace’s writings on computation appear mainly in her notes attached to her translation of an Italian engineer’s commentary on Babbage’s engine: L. F. Menabrea, “Sketch of the Analytical Engine invented by Charles Babbage,” trans. Ada, Countess of Lovelace, in Scientific Memoirs, vol. III, ed. R. Taylor (R. and J. E. Taylor, 1843). Menabrea’s original article, written in French and based on lectures given by Babbage in 1840, appears in Bibliothèque Universelle de Genève 82 (1842).

42. One of the seminal early papers on the possibility of artificial intelligence: Alan Turing, “Computing machinery and intelligence,” Mind 59 (1950): 433–60.

43. The Shakey project at SRI is summarized in a retrospective by one of its leaders: Nils Nilsson, “Shakey the robot,” technical note 323 (SRI International, 1984). A twenty-four-minute film, SHAKEY: Experimentation in Robot Learning and Planning, was made in 1969 and garnered national attention.

44. The book that marked the beginning of modern, probability-based AI: Judea Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference (Morgan Kaufmann, 1988).

45. Technically, chess is not fully observable. A program does need to remember a small amount of information to determine the legality of castling and en passant moves and to define draws by repetition or by the fifty-move rule.

46. For a complete exposition, see Chapter 2 of Stuart Russell and Peter Norvig, Artificial Intelligence: A Modern Approach, 3rd ed. (Pearson, 2010).

47. The size of the state space for StarCraft is discussed by Santiago Ontañon et al., “A survey of real-time strategy game AI research and competition in StarCraft,” IEEE Transactions on Computational Intelligence and AI in Games 5 (2013): 293–311. Vast numbers of moves are possible because a player can move all units simultaneously. The numbers go down as restrictions are imposed on how many units or groups of units can be moved at once.

48. On human–machine competition in StarCraft: Tom Simonite, “DeepMind beats pros at StarCraft in another triumph for bots,” Wired, January 25, 2019.

49. AlphaZero is described by David Silver et al., “Mastering chess and shogi by self-play with a general reinforcement learning algorithm,” arXiv:1712.01815 (2017).

50. Optimal paths in graphs are found using the A* algorithm and its many descendants: Peter Hart, Nils Nilsson, and Bertram Raphael, “A formal basis for the heuristic determination of minimum cost paths,” IEEE Transactions on Systems Science and Cybernetics SSC-4 (1968): 100–107.

51. The paper that introduced the Advice Taker program and logic-based knowledge systems: John McCarthy, “Programs with common sense,” in Proceedings of the Symposium on Mechanisation of Thought Processes (Her Majesty’s Stationery Office, 1958).

52. To get some sense of the significance of knowledge-based systems, consider database systems. A database contains concrete, individual facts, such as the location of my keys and the identities of your Facebook friends. Database systems cannot store general rules, such as the rules of chess or the legal definition of British citizenship. They can count how many people called Alice have friends called Bob, but they cannot determine whether a particular Alice meets the conditions for British citizenship or whether a particular sequence of moves on a chessboard will lead to checkmate. Database systems cannot combine two pieces of knowledge to produce a third: they support memory but not reasoning. (It is true that many modern database systems provide a way to add rules and a way to use those rules to derive new facts; to the extent that they do, they are really knowledge-based systems.) Despite being highly constricted versions of knowledge-based systems, database systems underlie most of present-day commercial activity and generate hundreds of billions of dollars in value every year.
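
A toy contrast between the two kinds of system, with invented facts and a single general rule; real databases and knowledge bases are vastly larger, but the distinction is the same:

```python
# Stored facts, as in a database:
facts = {("parent", "Alice", "Bob"), ("parent", "Bob", "Carol")}

# A database can retrieve only what is stored:
print(("parent", "Alice", "Bob") in facts)  # True

# A knowledge-based system can also apply a general rule, such as
# "grandparent(x, z) if parent(x, y) and parent(y, z)",
# to derive facts that were never stored:
derived = {("grandparent", x, z)
           for (_, x, y1) in facts
           for (_, y2, z) in facts
           if y1 == y2}
print(("grandparent", "Alice", "Carol") in derived)  # True
```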

53. The original paper describing the completeness theorem for first-order logic: Kurt Gödel, “Die Vollständigkeit der Axiome des logischen Funktionenkalküls,” Monatshefte für Mathematik 37 (1930): 349–60.

54. The reasoning algorithm for first-order logic does have a gap: if there is no answer—that is, if the available knowledge is insufficient to give an answer either way—then the algorithm may never finish. This is unavoidable: it is mathematically impossible for a correct algorithm always to terminate with “don’t know,” for essentially the same reason that no algorithm can solve the halting problem (this page).

55. The first algorithm for theorem-proving in first-order logic worked by reducing first-order sentences to (very large numbers of) propositional sentences: Martin Davis and Hilary Putnam, “A computing procedure for quantification theory,” Journal of the ACM 7 (1960): 201–15. Robinson’s resolution algorithm operated directly on first-order logical sentences, using “unification” to match complex expressions containing logical variables: J. Alan Robinson, “A machine-oriented logic based on the resolution principle,” Journal of the ACM 12 (1965): 23–41.

56. One might wonder how Shakey the logical robot ever reached any definite conclusions about what to do. The answer is simple: Shakey’s knowledge base contained false assertions. For example, Shakey believed that by executing “push object A through door D into room B,” object A would end up in room B. This belief was false because Shakey could get stuck in the doorway or miss the doorway altogether or someone might sneakily remove object A from Shakey’s grasp. Shakey’s plan execution module could detect plan failure and replan accordingly, so Shakey was not, strictly speaking, a purely logical system.

57. An early commentary on the role of probability in human thinking: Pierre-Simon Laplace, Essai philosophique sur les probabilités (Mme. Ve. Courcier, 1814).

58. Bayesian logic described in a fairly nontechnical way: Stuart Russell, “Unifying logic and probability,” Communications of the ACM 58 (2015): 88–97. The paper draws heavily on the PhD thesis research of my former student Brian Milch.

59. The original source for Bayes’ theorem: Thomas Bayes and Richard Price, “An essay towards solving a problem in the doctrine of chances,” Philosophical Transactions of the Royal Society of London 53 (1763): 370–418.

60. Technically, Samuel’s program did not treat winning and losing as absolute rewards; however, by fixing the value of material to be positive, the program generally tended to work towards winning.

61. The application of reinforcement learning to produce a world-class backgammon program: Gerald Tesauro, “Temporal difference learning and TD-Gammon,” Communications of the ACM 38 (1995): 58–68.

62. The DQN system that learns to play a wide variety of video games using deep RL: Volodymyr Mnih et al., “Human-level control through deep reinforcement learning,” Nature 518 (2015): 529–33.

63. Bill Gates’s remarks on Dota 2 AI: Catherine Clifford, “Bill Gates says gamer bots from Elon Musk-backed nonprofit are ‘huge milestone’ in A.I.,” CNBC, June 28, 2018.

64. An account of OpenAI Five’s victory over the human world champions at Dota 2: Kelsey Piper, “AI triumphs against the world’s top pro team in strategy game Dota 2,” Vox, April 13, 2019.

65. A compendium of cases in the literature where misspecification of reward functions led to unexpected behavior: Victoria Krakovna, “Specification gaming examples in AI,” Deep Safety (blog), April 2, 2018.

66. A case where an evolutionary fitness function defined in terms of maximum velocity led to very unexpected results: Karl Sims, “Evolving virtual creatures,” in Proceedings of the 21st Annual Conference on Computer Graphics and Interactive Techniques (ACM, 1994).

67. For a fascinating exposition of the possibilities of reflex agents, see Valentino Braitenberg, Vehicles: Experiments in Synthetic Psychology (MIT Press, 1984).

68. News article on a fatal accident involving a vehicle in autonomous mode that hit a pedestrian: Devin Coldewey, “Uber in fatal crash detected pedestrian but had emergency braking disabled,” TechCrunch, May 24, 2018.

69. On steering control algorithms, see, for example, Jarrod Snider, “Automatic steering methods for autonomous automobile path tracking,” technical report CMU-RI-TR-09-08, Robotics Institute, Carnegie Mellon University, 2009.

70. Norfolk and Norwich terriers are two categories in the ImageNet database. They are notoriously hard to tell apart and were viewed as a single breed until 1964.

71. A very unfortunate incident with image labeling: Daniel Howley, “Google Photos mislabels 2 black Americans as gorillas,” Yahoo Tech, June 29, 2015.

72. Follow-up article on Google and gorillas: Tom Simonite, “When it comes to gorillas, Google Photos remains blind,” Wired, January 11, 2018.

CHAPTER 3

1. The basic plan for game-playing algorithms was laid out by Claude Shannon, “Programming a computer for playing chess,” Philosophical Magazine, 7th ser., 41 (1950): 256–75.

2. See figure 5.12 of Stuart Russell and Peter Norvig, Artificial Intelligence: A Modern Approach, 1st ed. (Prentice Hall, 1995). Note that the rating of chess players and chess programs is not an exact science. Kasparov’s highest-ever Elo rating was 2851, achieved in 1999, but current chess engines such as Stockfish are rated at 3300 or more.

3. The earliest reported autonomous vehicle on a public road: Ernst Dickmanns and Alfred Zapp, “Autonomous high speed road vehicle guidance by computer vision,” IFAC Proceedings Volumes 20 (1987): 221–26.

4. The safety record for Google (subsequently Waymo) vehicles: “Waymo safety report: On the road to fully self-driving,” 2018.

5. So far there have been at least two driver fatalities and one pedestrian fatality. Some references follow, along with brief quotes describing what happened. Danny Yadron and Dan Tynan, “Tesla driver dies in first fatal crash while using autopilot mode,” Guardian, June 30, 2016: “The autopilot sensors on the Model S failed to distinguish a white tractor-trailer crossing the highway against a bright sky.” Megan Rose Dickey, “Tesla Model X sped up in Autopilot mode seconds before fatal crash, according to NTSB,” TechCrunch, June 7, 2018: “At 3 seconds prior to the crash and up to the time of impact with the crash attenuator, the Tesla’s speed increased from 62 to 70.8 mph, with no precrash braking or evasive steering movement detected.” Devin Coldewey, “Uber in fatal crash detected pedestrian but had emergency braking disabled,” TechCrunch, May 24, 2018: “Emergency braking maneuvers are not enabled while the vehicle is under computer control, to reduce the potential for erratic vehicle behavior.”

6. The Society of Automotive Engineers (SAE) defines six levels of automation, where Level 0 is none at all and Level 5 is full automation: “The full-time performance by an automatic driving system of all aspects of the dynamic driving task under all roadway and environmental conditions that can be managed by a human driver.”

7. Forecast of economic effects of automation on transportation costs: Adele Peters, “It could be 10 times cheaper to take electric robo-taxis than to own a car by 2030,” Fast Company, May 30, 2017.

8. The impact of accidents on the prospects for regulatory action on autonomous vehicles: Richard Waters, “Self-driving car death poses dilemma for regulators,” Financial Times, March 20, 2018.

9. The impact of accidents on public perception of autonomous vehicles: Cox Automotive, “Autonomous vehicle awareness rising, acceptance declining, according to Cox Automotive mobility study,” August 16, 2018.

10. The original chatbot: Joseph Weizenbaum, “ELIZA—a computer program for the study of natural language communication between man and machine,” Communications of the ACM 9 (1966): 36–45.

11. See physiome.org for current activities in physiological modeling. Work in the 1960s assembled models with thousands of differential equations: Arthur Guyton, Thomas Coleman, and Harris Granger, “Circulation: Overall regulation,” Annual Review of Physiology 34 (1972): 13–44.

12. Some of the earliest work on tutoring systems was done by Pat Suppes and colleagues at Stanford: Patrick Suppes and Mona Morningstar, “Computer-assisted instruction,” Science 166 (1969): 343–50.

13. Michael Yudelson, Kenneth Koedinger, and Geoffrey Gordon, “Individualized Bayesian knowledge tracing models,” in Artificial Intelligence in Education: 16th International Conference, ed. H. Chad Lane et al. (Springer, 2013).

14. For an example of machine learning on encrypted data, see Reza Shokri and Vitaly Shmatikov, “Privacy-preserving deep learning,” in Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security (ACM, 2015).

15. A retrospective on the first smart home, based on a lecture by its inventor, James Sutherland: James E. Tomayko, “Electronic Computer for Home Operation (ECHO): The first home computer,” IEEE Annals of the History of Computing 16 (1994): 59–61.

16. Summary of a smart-home project based on machine learning and automated decisions: Diane Cook et al., “MavHome: An agent-based smart home,” in Proceedings of the 1st IEEE International Conference on Pervasive Computing and Communications (IEEE, 2003).

17. For the beginnings of an analysis of user experiences in smart homes, see Scott Davidoff et al., “Principles of smart home control,” in Ubicomp 2006: Ubiquitous Computing, ed. Paul Dourish and Adrian Friday (Springer, 2006).

18. Commercial announcement of AI-based smart homes: “The Wolff Company unveils revolutionary smart home technology at new Annadel Apartments in Santa Rosa, California,” Business Insider, March 12, 2018.

19. Article on robot chefs as commercial products: Eustacia Huen, “The world’s first home robotic chef can cook over 100 meals,” Forbes, October 31, 2016.

20. Report from my Berkeley colleagues on deep RL for robotic motor control: Sergey Levine et al., “End-to-end training of deep visuomotor policies,” Journal of Machine Learning Research 17 (2016): 1–40.

21. On the possibilities for automating the work of hundreds of thousands of warehouse workers: Tom Simonite, “Grasping robots compete to rule Amazon’s warehouses,” Wired, July 26, 2017.

22. I’m assuming a generous one laptop-CPU minute per page, or about 10^11 operations. A third-generation tensor processing unit from Google runs at about 10^17 operations per second, meaning that it can read a million pages per second, or about five hours for eighty million two-hundred-page books.
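
The same back-of-envelope arithmetic as a short script, using the note’s round numbers:

```python
ops_per_page = 1e11       # one laptop-CPU minute per page
ops_per_second = 1e17     # approximate rate of a third-generation TPU
pages = 80e6 * 200        # eighty million two-hundred-page books

pages_per_second = ops_per_second / ops_per_page  # one million pages/s
print(pages / pages_per_second / 3600)            # roughly 4.4 hours
```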

23. A 2003 study on the global volume of information production by all channels: Peter Lyman and Hal Varian, “How much information?” sims.berkeley.edu/research/projects/how-much-info-2003.

24. For details on the use of speech recognition by intelligence agencies, see Dan Froomkin, “How the NSA converts spoken words into searchable text,” The Intercept, May 5, 2015.

25. Analysis of visual imagery from satellites is an enormous task: Mike Kim, “Mapping poverty from space with the World Bank,” Medium.com, January 4, 2017. Kim estimates eight million people working 24/7, which converts to more than thirty million people working forty hours per week. I suspect this is an overestimate in practice, because the vast majority of the images would exhibit negligible change over the course of one day. On the other hand, the US intelligence community employs tens of thousands of people sitting in vast rooms staring at satellite images just to keep track of what’s happening in small regions of interest; so one million people is probably about right for the whole world.

26. There is substantial progress towards a global observatory based on real-time satellite image data: David Jensen and Jillian Campbell, “Digital earth: Building, financing and governing a digital ecosystem for planetary data,” white paper for the UN Science-Policy-Business Forum on the Environment, 2018.

27. Luke Muehlhauser has written extensively on AI predictions, and I am indebted to him for tracking down original sources for the quotations that follow. See Luke Muehlhauser, “What should we learn from past AI forecasts?” Open Philanthropy Project report, 2016.

28. A forecast of the arrival of human-level AI within twenty years: Herbert Simon, The New Science of Management Decision (Harper & Row, 1960).

29. A forecast of the arrival of human-level AI within a generation: Marvin Minsky, Computation: Finite and Infinite Machines (Prentice Hall, 1967).

30. John McCarthy’s forecast of the arrival of human-level AI within “five to 500 years”: Ian Shenker, “Brainy robots in our future, experts think,” Detroit Free Press, September 30, 1977.

31. For a summary of surveys of AI researchers on their estimates for the arrival of human-level AI, see aiimpacts.org. An extended discussion of survey results on human-level AI is given by Katja Grace et al., “When will AI exceed human performance? Evidence from AI experts,” arXiv:1705.08807v3 (2018).

32. For a chart mapping raw computer power against brain power, see Ray Kurzweil, “The law of accelerating returns,” Kurzweilai.net, March 7, 2001.

33. The Allen Institute’s Project Aristo: allenai.org/aristo.

34. For an analysis of the knowledge required to perform well on fourth-grade tests of comprehension and common sense, see Peter Clark et al., “Automatic construction of inference-supporting knowledge bases,” in Proceedings of the Workshop on Automated Knowledge Base Construction (2014), akbc.ws/2014.

35. The NELL project on machine reading is described by Tom Mitchell et al., “Never-ending learning,” Communications of the ACM 61 (2018): 103–15.

36. The idea of bootstrapping inferences from text is due to Sergey Brin, “Extracting patterns and relations from the World Wide Web,” in The World Wide Web and Databases, ed. Paolo Atzeni, Alberto Mendelzon, and Giansalvatore Mecca (Springer, 1998).

37. For a visualization of the black-hole collision detected by LIGO, see LIGO Lab Caltech, “Warped space and time around colliding black holes,” February 11, 2016, youtube.com/watch?v=1agm33iEAuo.

38. The first publication describing observation of gravitational waves: Ben Abbott et al., “Observation of gravitational waves from a binary black hole merger,” Physical Review Letters 116 (2016): 061102.

39. On babies as scientists: Alison Gopnik, Andrew Meltzoff, and Patricia Kuhl, The Scientist in the Crib: Minds, Brains, and How Children Learn (William Morrow, 1999).

40. A summary of several projects on automated scientific analysis of experimental data to discover laws: Patrick Langley et al., Scientific Discovery: Computational Explorations of the Creative Processes (MIT Press, 1987).

41. Some early work on machine learning guided by prior knowledge: Stuart Russell, The Use of Knowledge in Analogy and Induction (Pitman, 1989).

42. Goodman’s philosophical analysis of induction remains a source of inspiration: Nelson Goodman, Fact, Fiction, and Forecast (University of London Press, 1954).

43. A veteran AI researcher complains about mysticism in the philosophy of science: Herbert Simon, “Explaining the ineffable: AI on the topics of intuition, insight and inspiration,” in Proceedings of the 14th International Joint Conference on Artificial Intelligence, ed. Chris Mellish (Morgan Kaufmann, 1995).

44. A survey of inductive logic programming by two originators of the field: Stephen Muggleton and Luc de Raedt, “Inductive logic programming: Theory and methods,” Journal of Logic Programming 19–20 (1994): 629–79.

45. For an early mention of the importance of encapsulating complex operations as new primitive actions, see Alfred North Whitehead, An Introduction to Mathematics (Henry Holt, 1911).

46. Work demonstrating that a simulated robot can learn entirely by itself to stand up: John Schulman et al., “High-dimensional continuous control using generalized advantage estimation,” arXiv:1506.02438 (2015). A video demonstration is available at youtube.com/watch?v=SHLuf2ZBQSw.

47. A description of a reinforcement learning system that learns to play a capture-the-flag video game: Max Jaderberg et al., “Human-level performance in first-person multiplayer games with population-based deep reinforcement learning,” arXiv:1807.01281 (2018).

48. A view of AI progress over the next few years: Peter Stone et al., “Artificial intelligence and life in 2030,” One Hundred Year Study on Artificial Intelligence, report of the 2015 Study Panel, 2016.

49. The media-fueled argument between Elon Musk and Mark Zuckerberg: Peter Holley, “Billionaire burn: Musk says Zuckerberg’s understanding of AI threat ‘is limited,’” The Washington Post, July 25, 2017.

50. On the value of search engines to individual users: Erik Brynjolfsson, Felix Eggers, and Avinash Gannamaneni, “Using massive online choice experiments to measure changes in well-being,” working paper no. 24514, National Bureau of Economic Research, 2018.

51. Penicillin was discovered several times and its curative powers were described in medical publications, but no one seems to have noticed. See en.wikipedia.org/wiki/History_of_penicillin.

52. For a discussion of some of the more esoteric risks from omniscient, clairvoyant AI systems, see David Auerbach, “The most terrifying thought experiment of all time,” Slate, July 17, 2014.

53. An analysis of some potential pitfalls in thinking about advanced AI: Kevin Kelly, “The myth of a superhuman AI,” Wired, April 25, 2017.

54. Machines may share some aspects of cognitive structure with humans, particularly those aspects dealing with perception and manipulation of the physical world and the conceptual structures involved in natural language understanding. Their deliberative processes are likely to be quite different because of the enormous disparities in hardware.

55. According to 2016 survey data, the eighty-eighth percentile corresponds to $100,000 per year: American Community Survey, US Census Bureau, www.census.gov/programs-surveys/acs. For the same year, global per capita GDP was $10,133: National Accounts Main Aggregates Database, UN Statistics Division, unstats.un.org/unsd/snaama.

56. If the GDP growth phases in over ten years or twenty years, it’s worth $9,400 trillion or $6,800 trillion, respectively—still nothing to sneeze at. On an interesting historical note, I. J. Good, who popularized the notion of an intelligence explosion (this page), estimated the value of human-level AI to be at least “one megaKeynes,” referring to the fabled economist John Maynard Keynes. The value of Keynes’s contributions was estimated in 1963 as £100 billion, so a megaKeynes comes out to around $2,200,000 trillion in 2016 dollars. Good pinned the value of AI primarily on its potential to ensure that the human race survives indefinitely. Later, he came to wonder whether he should have added a minus sign.
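
For what it is worth, the conversion implied by these figures can be checked directly; the factor of 22 for turning 1963 pounds into 2016 dollars is inferred from the note’s own numbers rather than from any price index:

```python
keynes_1963_pounds = 100e9                 # 1963 valuation of Keynes's work
megakeynes_1963_pounds = 1e6 * keynes_1963_pounds   # one million Keyneses
dollars_2016 = 2.2e18                      # the note's $2,200,000 trillion

print(dollars_2016 / megakeynes_1963_pounds)        # implied factor: 22.0
```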

57. The EU announced plans for $24 billion in research and development spending for the period 2019–20. See European Commission, “Artificial intelligence: Commission outlines a European approach to boost investment and set ethical guidelines,” press release, April 25, 2018. China’s long-term investment plan for AI, announced in 2017, envisages a core AI industry generating $150 billion annually by 2030. See, for example, Paul Mozur, “Beijing wants A.I. to be made in China by 2030,” The New York Times, July 20, 2017.

58. See, for example, Rio Tinto’s Mine of the Future program at riotinto.com/australia/pilbara/mine-of-the-future-9603.aspx.

59. A retrospective analysis of economic growth: Jan Luiten van Zanden et al., eds., How Was Life? Global Well-Being since 1820 (OECD Publishing, 2014).

60. The desire for relative advantage over others, rather than an absolute quality of life, is a positional good; see Chapter 9.

CHAPTER 4

1. Wikipedia’s article on the Stasi has several useful references on its workforce and its overall impact on East German life.

2. For details on Stasi files, see Cullen Murphy, God’s Jury: The Inquisition and the Making of the Modern World (Houghton Mifflin Harcourt, 2012).

3. For a thorough analysis of AI surveillance systems, see Jay Stanley, The Dawn of Robot Surveillance (American Civil Liberties Union, 2019).

4. Recent books on surveillance and control include Shoshana Zuboff, The Age of Surveillance Capitalism: The Fight for a Human Future at the New Frontier of Power (PublicAffairs, 2019) and Roger McNamee, Zucked: Waking Up to the Facebook Catastrophe (Penguin Press, 2019).

5. News article on a blackmail bot: Avivah Litan, “Meet Delilah—the first insider threat Trojan,” Gartner Blog Network, July 14, 2016.

6. For a low-tech version of human susceptibility to misinformation, in which an unsuspecting individual becomes convinced that the world is being destroyed by meteor strikes, see Derren Brown: Apocalypse, “Part One,” directed by Simon Dinsell, 2012, youtube.com/watch?v=o_CUrMJOxqs.

7. An economic analysis of reputation systems and their corruption is given by Steven Tadelis, “Reputation and feedback systems in online platform markets,” Annual Review of Economics 8 (2016): 321–40.

8. Goodhart’s law: “Any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes.” For example, there may once have been a correlation between faculty quality and faculty salary, so the US News & World Report college rankings measure faculty quality by faculty salaries. This has contributed to a salary arms race that benefits faculty members but not the students who pay for those salaries. The arms race changes faculty salaries in a way that does not depend on faculty quality, so the correlation tends to disappear.

9. An article describing German efforts to police public discourse: Bernhard Rohleder, “Germany set out to delete hate speech online. Instead, it made things worse,” WorldPost, February 20, 2018.

10. On the “infopocalypse”: Aviv Ovadya, “What’s worse than fake news? The distortion of reality itself,” WorldPost, February 22, 2018.

11. On the corruption of online hotel reviews: Dina Mayzlin, Yaniv Dover, and Judith Chevalier, “Promotional reviews: An empirical investigation of online review manipulation,” American Economic Review 104 (2014): 2421–55.

12. Statement of Germany at the Meeting of the Group of Governmental Experts, Convention on Certain Conventional Weapons, Geneva, April 10, 2018.

13. The Slaughterbots movie, funded by the Future of Life Institute, appeared in November 2017 and is available at youtube.com/watch?v=9CO6M2HsoIA.

14. For a report on one of the bigger faux pas in military public relations, see Dan Lamothe, “Pentagon agency wants drones to hunt in packs, like wolves,” The Washington Post, January 23, 2015.

15. Announcement of a large-scale drone swarm experiment: US Department of Defense, “Department of Defense announces successful micro-drone demonstration,” news release no. NR-008-17, January 9, 2017.

16. Examples of research centers studying the impact of technology on employment are the Work and Intelligent Tools and Systems group at Berkeley, the Future of Work and Workers project at the Center for Advanced Study in the Behavioral Sciences at Stanford, and the Future of Work Initiative at Carnegie Mellon University.

17. A pessimistic take on future technological unemployment: Martin Ford, Rise of the Robots: Technology and the Threat of a Jobless Future (Basic Books, 2015).

18. Calum Chace, The Economic Singularity: Artificial Intelligence and the Death of Capitalism (Three Cs, 2016).

19. For an excellent collection of essays, see Ajay Agrawal, Joshua Gans, and Avi Goldfarb, eds., The Economics of Artificial Intelligence: An Agenda (National Bureau of Economic Research, 2019).

20. The mathematical analysis behind this “inverted-U” employment curve is given by James Bessen, “Artificial intelligence and jobs: The role of demand” in The Economics of Artificial Intelligence, ed. Agrawal, Gans, and Goldfarb.

21. For a discussion of economic dislocation arising from automation, see Eduardo Porter, “Tech is splitting the US work force in two,” The New York Times, February 4, 2019. The article cites the following report for this conclusion: David Autor and Anna Salomons, “Is automation labor-displacing? Productivity growth, employment, and the labor share,” Brookings Papers on Economic Activity (2018).

22. For data on the growth of banking in the twentieth century, see Thomas Philippon, “The evolution of the US financial industry from 1860 to 2007: Theory and evidence,” working paper, 2008.

23. The bible for jobs data and the growth and decline of occupations: US Bureau of Labor Statistics, Occupational Outlook Handbook: 2018–2019 Edition (Bernan Press, 2018).

24. A report on trucking automation: Lora Kolodny, “Amazon is hauling cargo in self-driving trucks developed by Embark,” CNBC, January 30, 2019.

25. The progress of automation in legal analytics, describing the results of a contest: Jason Tashea, “AI software is more accurate, faster than attorneys when assessing NDAs,” ABA Journal, February 26, 2018.

26. A commentary by a distinguished economist, with a title explicitly evoking Keynes’s 1930 article: Lawrence Summers, “Economic possibilities for our children,” NBER Reporter (2013).

27. The analogy between data science employment and a small lifeboat for a giant cruise ship comes from a discussion with Yong Ying-I, head of Singapore’s Public Service Division. She conceded that it was correct on the global scale, but noted that “Singapore is small enough to fit in the lifeboat.”

28. Support for UBI from a conservative viewpoint: Sam Bowman, “The ideal welfare system is a basic income,” Adam Smith Institute, November 25, 2013.

29. Support for UBI from a progressive viewpoint: Jonathan Bartley, “The Greens endorse a universal basic income. Others need to follow,” The Guardian, June 2, 2017.

30. Chace, in The Economic Singularity, calls the “paradise” version of UBI the Star Trek economy, noting that in the more recent series of Star Trek episodes, money has been abolished because technology has created essentially unlimited material goods and energy. He also points to the massive changes in economic and social organization that will be needed to make such a system successful.

31. The economist Richard Baldwin also predicts a future of personal services in his book The Globotics Upheaval: Globalization, Robotics, and the Future of Work (Oxford University Press, 2019).

32. The book that is viewed as having exposed the failure of “whole-word” literacy education and launched decades of struggle between the two main schools of thought on reading: Rudolf Flesch, Why Johnny Can’t Read: And What You Can Do about It (Harper & Bros., 1955).

33. On educational methods that enable the recipient to adapt to the rapid rate of technological and economic change in the next few decades: Joseph Aoun, Robot-Proof: Higher Education in the Age of Artificial Intelligence (MIT Press, 2017).

34. A radio lecture in which Turing predicted that humans would be overtaken by machines: Alan Turing, “Can digital machines think?,” May 15, 1951, radio broadcast, BBC Third Programme. Typescript available at turingarchive.org.

35. News article describing the “naturalization” of Sophia as a citizen of Saudi Arabia: Dave Gershgorn, “Inside the mechanical brain of the world’s first robot citizen,” Quartz, November 12, 2017.

36. On Yann LeCun’s view of Sophia: Shona Ghosh, “Facebook’s AI boss described Sophia the robot as ‘complete b——t’ and ‘Wizard-of-Oz AI,’” Business Insider, January 6, 2018.

37. An EU proposal on legal rights for robots: Committee on Legal Affairs of the European Parliament, “Report with recommendations to the Commission on Civil Law Rules on Robotics (2015/2103(INL)),” 2017.

38. The GDPR provision on a “right to an explanation” is not, in fact, new: it is very similar to Article 15(1) of the 1995 Data Protection Directive, which it supersedes.

39. Here are three recent papers providing insightful mathematical analyses of fairness: Moritz Hardt, Eric Price, and Nati Srebro, “Equality of opportunity in supervised learning,” in Advances in Neural Information Processing Systems 29, ed. Daniel Lee et al. (2016); Matt Kusner et al., “Counterfactual fairness,” in Advances in Neural Information Processing Systems 30, ed. Isabelle Guyon et al. (2017); Jon Kleinberg, Sendhil Mullainathan, and Manish Raghavan, “Inherent trade-offs in the fair determination of risk scores,” in 8th Innovations in Theoretical Computer Science Conference, ed. Christos Papadimitriou (Dagstuhl Publishing, 2017).

40. News article describing the consequences of software failure for air traffic control: Simon Calder, “Thousands stranded by flight cancellations after systems failure at Europe’s air-traffic coordinator,” The Independent, April 3, 2018.

CHAPTER 5

1. Lovelace wrote, “The Analytical Engine has no pretensions whatever to originate anything. It can do whatever we know how to order it to perform. It can follow analysis; but it has no power of anticipating any analytical relations or truths.” This was one of the arguments against AI that was refuted by Alan Turing, “Computing machinery and intelligence,” Mind 59 (1950): 433–60.

2. The earliest known article on existential risk from AI was by Richard Thornton, “The age of machinery,” Primitive Expounder IV (1847): 281.

3. “The Book of the Machines” was based on an earlier article by Samuel Butler, “Darwin among the machines,” The Press (Christchurch, New Zealand), June 13, 1863.

4. Another lecture in which Turing predicted the subjugation of humankind: Alan Turing, “Intelligent machinery, a heretical theory” (lecture given to the 51 Society, Manchester, 1951). Typescript available at turingarchive.org.

5. Wiener’s prescient discussion of technological control over humanity and a plea to retain human autonomy: Norbert Wiener, The Human Use of Human Beings (Riverside Press, 1950).

6. The front-cover blurb from Wiener’s 1950 book is remarkably similar to the motto of the Future of Life Institute, an organization dedicated to studying the existential risks that humanity faces: “Technology is giving life the potential to flourish like never before . . . or to self-destruct.”

7. An updating of Wiener’s views arising from his increased appreciation of the possibility of intelligent machines: Norbert Wiener, God and Golem, Inc.: A Comment on Certain Points Where Cybernetics Impinges on Religion (MIT Press, 1964).

8. Asimov’s Three Laws of Robotics first appeared in Isaac Asimov, “Runaround,” Astounding Science Fiction, March 1942. The laws are as follows:

  1. A robot may not injure a human being or, through inaction, allow a human being to come to harm.

  2. A robot must obey the orders given it by human beings except where such orders would conflict with the First Law.

  3. A robot must protect its own existence as long as such protection does not conflict with the First or Second Laws.

It is important to understand that Asimov proposed these laws as a way to generate interesting story plots, not as a serious guide for future roboticists. Several of his stories, including “Runaround,” illustrate the problematic consequences of taking the laws literally. From the standpoint of modern AI, the laws fail to acknowledge any element of probability and risk: the legality of robot actions that expose a human to some probability of harm—however infinitesimal—is therefore unclear.

9. The notion of instrumental goals is due to Stephen Omohundro, “The nature of self-improving artificial intelligence” (unpublished manuscript, 2008). See also Stephen Omohundro, “The basic AI drives,” in Artificial General Intelligence 2008: Proceedings of the First AGI Conference, ed. Pei Wang, Ben Goertzel, and Stan Franklin (IOS Press, 2008).

10. The objective of Johnny Depp’s character, Will Caster, seems to be to solve the problem of physical reincarnation so that he can be reunited with his wife, Evelyn. This just goes to show that the nature of the overarching objective doesn’t matter—the instrumental goals are all the same.

11. The original source for the idea of an intelligence explosion: I. J. Good, “Speculations concerning the first ultraintelligent machine,” in Advances in Computers, vol. 6, ed. Franz Alt and Morris Rubinoff (Academic Press, 1965).

12. An example of the impact of the intelligence explosion idea: Luke Muehlhauser, in Facing the Intelligence Explosion (intelligenceexplosion.com), writes, “Good’s paragraph ran over me like a train.”

13. Diminishing returns can be illustrated as follows: suppose that a 16 percent improvement in intelligence creates a machine capable of making an 8 percent improvement, which in turn creates a 4 percent improvement, and so on. This process reaches a limit at about 36 percent above the original level. For more discussion on these issues, see Eliezer Yudkowsky, “Intelligence explosion microeconomics,” technical report 2013-1, Machine Intelligence Research Institute, 2013.
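
The compounding here is easy to verify. The following few lines of Python (an illustrative check of the arithmetic, not from the original text) multiply out the sequence of ever-halving improvements and confirm the limit of roughly 36 percent:

```python
# Compound a sequence of self-improvements that halves at each step:
# 16%, then 8%, then 4%, and so on. The product converges to a limit.
level = 1.0          # original intelligence level, normalized to 1
gain = 0.16          # first improvement: 16 percent
while gain > 1e-12:  # stop once further gains are negligible
    level *= 1.0 + gain
    gain /= 2.0
print(f"limit: {level:.4f} ({100 * (level - 1):.1f}% above the original)")
# Prints a limit of about 1.356, i.e., roughly 36 percent above the original.
```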

14. For a view of AI in which humans become irrelevant, see Hans Moravec, Mind Children: The Future of Robot and Human Intelligence (Harvard University Press, 1988). See also Hans Moravec, Robot: Mere Machine to Transcendent Mind (Oxford University Press, 2000).

CHAPTER 6

1. A serious publication provides a serious review of Bostrom’s Superintelligence: Paths, Dangers, Strategies: “Clever cogs,” Economist, August 9, 2014.

2. A discussion of myths and misunderstandings concerning the risks of AI: Scott Alexander, “AI researchers on AI risk,” Slate Star Codex (blog), May 22, 2015.

3. The classic work on multiple dimensions of intelligence: Howard Gardner, Frames of Mind: The Theory of Multiple Intelligences (Basic Books, 1983).

4. On the implications of multiple dimensions of intelligence for the possibility of superhuman AI: Kevin Kelly, “The myth of a superhuman AI,” Wired, April 25, 2017.

5. Evidence that chimpanzees have better short-term memory than humans: Sana Inoue and Tetsuro Matsuzawa, “Working memory of numerals in chimpanzees,” Current Biology 17 (2007), R1004–5.

6. An important early work questioning the prospects for rule-based AI systems: Hubert Dreyfus, What Computers Can’t Do (MIT Press, 1972).

7. The first in a series of books seeking physical explanations for consciousness and raising doubts about the ability of AI systems to achieve real intelligence: Roger Penrose, The Emperor’s New Mind: Concerning Computers, Minds, and the Laws of Physics (Oxford University Press, 1989).

8. A revival of the critique of AI based on the incompleteness theorem: Luciano Floridi, “Should we be afraid of AI?” Aeon, May 9, 2016.

9. A revival of the critique of AI based on the Chinese room argument: John Searle, “What your computer can’t know,” The New York Review of Books, October 9, 2014.

10. A report from distinguished AI researchers claiming that superhuman AI is probably impossible: Peter Stone et al., “Artificial intelligence and life in 2030,” One Hundred Year Study on Artificial Intelligence, report of the 2015 Study Panel, 2016.

11. News article based on Andrew Ng’s dismissal of risks from AI: Chris Williams, “AI guru Ng: Fearing a rise of killer robots is like worrying about overpopulation on Mars,” Register, March 19, 2015.

12. An example of the “experts know best” argument: Oren Etzioni, “It’s time to intelligently discuss artificial intelligence,” Backchannel, December 9, 2014.

13. News article claiming that real AI researchers dismiss talk of risks: Erik Sofge, “Bill Gates fears AI, but AI researchers know better,” Popular Science, January 30, 2015.

14. Another claim that real AI researchers dismiss AI risks: David Kenny, “IBM’s open letter to Congress on artificial intelligence,” June 27, 2017, ibm.com/blogs/policy/kenny-artificial-intelligence-letter.

15. Report from the workshop that proposed voluntary restrictions on genetic engineering: Paul Berg et al., “Summary statement of the Asilomar Conference on Recombinant DNA Molecules,” Proceedings of the National Academy of Sciences 72 (1975): 1981–84.

16. Policy statement arising from the invention of CRISPR-Cas9 for gene editing: Organizing Committee for the International Summit on Human Gene Editing, “On human gene editing: International Summit statement,” December 3, 2015.

17. The latest policy statement from leading biologists: Eric Lander et al., “Adopt a moratorium on heritable genome editing,” Nature 567 (2019): 165–68.

18. Etzioni’s comment that one cannot mention risks if one does not also mention benefits appears alongside his analysis of survey data from AI researchers: Oren Etzioni, “No, the experts don’t think superintelligent AI is a threat to humanity,” MIT Technology Review, September 20, 2016. In his analysis he argues that anyone who expects superhuman AI to take more than twenty-five years—which includes this author as well as Nick Bostrom—is not concerned about the risks of AI.

19. A news article with quotations from the Musk–Zuckerberg “debate”: Alanna Petroff, “Elon Musk says Mark Zuckerberg’s understanding of AI is ‘limited,’” CNN Money, July 25, 2017.

20. In 2015 the Information Technology and Innovation Foundation organized a debate titled “Are super intelligent computers really a threat to humanity?” Robert Atkinson, director of the foundation, suggests that mentioning risks is likely to result in reduced funding for AI. Video available at itif.org/events/2015/06/30/are-super-intelligent-computers-really-threat-humanity; the relevant discussion begins at 41:30.

21. A claim that our culture of safety will solve the AI control problem without ever mentioning it: Steven Pinker, “Tech prophecy and the underappreciated causal power of ideas,” in Possible Minds: Twenty-Five Ways of Looking at AI, ed. John Brockman (Penguin Press, 2019).

22. For an interesting analysis of Oracle AI, see Stuart Armstrong, Anders Sandberg, and Nick Bostrom, “Thinking inside the box: Controlling and using an Oracle AI,” Minds and Machines 22 (2012): 299–324.

23. Views on why AI is not going to take away jobs: Kenny, “IBM’s open letter.”

24. An example of Kurzweil’s positive views of merging human brains with AI: Ray Kurzweil, interview by Bob Pisani, June 5, 2015, Exponential Finance Summit, New York, NY.

25. Article quoting Elon Musk on neural lace: Tim Urban, “Neuralink and the brain’s magical future,” Wait But Why, April 20, 2017.

26. For the most recent developments in Berkeley’s neural dust project, see David Piech et al., “StimDust: A 1.7 mm³, implantable wireless precision neural stimulator with ultrasonic power and communication,” arXiv:1807.07590 (2018).

27. Susan Schneider, in Artificial You: AI and the Future of Your Mind (Princeton University Press, 2019), points out the risks of ignorance in proposed technologies such as uploading and neural prostheses: that, absent any real understanding of whether electronic devices can be conscious and given the continuing philosophical confusion over persistent personal identity, we may inadvertently end our own conscious existences or inflict suffering on conscious machines without realizing that they are conscious.

28. An interview with Yann LeCun on AI risks: Guia Marie Del Prado, “Here’s what Facebook’s artificial intelligence expert thinks about the future,” Business Insider, September 23, 2015.

29. A diagnosis of AI control problems arising from an excess of testosterone: Steven Pinker, “Thinking does not imply subjugating,” in What to Think About Machines That Think, ed. John Brockman (Harper Perennial, 2015).

30. A seminal work on many philosophical topics, including the question of whether moral obligations may be perceived in the natural world: David Hume, A Treatise of Human Nature (John Noon, 1738).

31. An argument that a sufficiently intelligent machine cannot help but pursue human objectives: Rodney Brooks, “The seven deadly sins of AI predictions,” MIT Technology Review, October 6, 2017.

32. Pinker, “Thinking does not imply subjugating.”

33. For an optimistic view arguing that AI safety problems will necessarily be resolved in our favor: Steven Pinker, “Tech prophecy.”

34. On the unsuspected alignment between “skeptics” and “believers” in AI risk: Alexander, “AI researchers on AI risk.”

CHAPTER 7

1. For a guide to detailed brain modeling, now slightly outdated, see Anders Sandberg and Nick Bostrom, “Whole brain emulation: A roadmap,” technical report 2008-3, Future of Humanity Institute, Oxford University, 2008.

2. For an introduction to genetic programming from a leading exponent, see John Koza, Genetic Programming: On the Programming of Computers by Means of Natural Selection (MIT Press, 1992).

3. The parallel to Asimov’s Three Laws of Robotics is entirely coincidental.

4. The same point is made by Eliezer Yudkowsky, “Coherent extrapolated volition,” technical report, Singularity Institute, 2004. Yudkowsky argues that directly building in “Four Great Moral Principles That Are All We Need to Program into AIs” is a sure road to ruin for humanity. His notion of the “coherent extrapolated volition of humankind” has the same general flavor as the first principle; the idea is that a superintelligent AI system could work out what humans, collectively, really want.

5. You can certainly have preferences over whether a machine is helping you achieve your preferences or you are achieving them through your own efforts. For example, suppose you prefer outcome A to outcome B, all other things being equal. You are unable to achieve outcome A unaided, and yet you still prefer B to getting A with the machine’s help. In that case the machine should decide not to help you—unless perhaps it can do so in a way that is completely undetectable by you. You may, of course, have preferences about undetectable help as well as detectable help.

6. The phrase “the greatest good of the greatest number” originates in the work of Francis Hutcheson, An Inquiry into the Original of Our Ideas of Beauty and Virtue, In Two Treatises (D. Midwinter et al., 1725). Some have ascribed the formulation to an earlier comment by Wilhelm Leibniz; see Joachim Hruschka, “The greatest happiness principle and other early German anticipations of utilitarian theory,” Utilitas 3 (1991): 165–77.

7. One might propose that the machine should include terms for animals as well as humans in its own objective function. If these terms have weights that correspond to how much people care about animals, then the end result will be the same as if the machine cares about animals only through caring about humans who care about animals. Giving each living animal equal weight in the machine’s objective function would certainly be catastrophic—for example, we are outnumbered fifty thousand to one by Antarctic krill and a billion trillion to one by bacteria.

8. The moral philosopher Toby Ord made the same point to me in his comments on an early draft of this book: “Interestingly, the same is true in the study of moral philosophy. Uncertainty about moral value of outcomes was almost completely neglected in moral philosophy until very recently. Despite the fact that it is our uncertainty of moral matters that leads people to ask others for moral advice and, indeed, to do research on moral philosophy at all!”

9. One excuse for not paying attention to uncertainty about preferences is that it is formally equivalent to ordinary uncertainty, in the following sense: being uncertain about what I like is the same as being certain that I like likable things while being uncertain about what things are likable. This is just a trick that appears to move the uncertainty into the world, by making “likability by me” a property of objects rather than a property of me. In game theory, this trick has been thoroughly institutionalized since the 1960s, following a series of papers by my late colleague and Nobel laureate John Harsanyi: “Games with incomplete information played by ‘Bayesian’ players, Parts I–III,” Management Science 14 (1967, 1968): 159–82, 320–34, 486–502. In decision theory, the standard reference is the following: Richard Cyert and Morris de Groot, “Adaptive utility,” in Expected Utility Hypotheses and the Allais Paradox, ed. Maurice Allais and Ole Hagen (D. Reidel, 1979).

10. AI researchers working in the area of preference elicitation are an obvious exception. See, for example, Craig Boutilier, “On the foundations of expected expected utility,” in Proceedings of the 18th International Joint Conference on Artificial Intelligence (Morgan Kaufmann, 2003). Also Alan Fern et al., “A decision-theoretic model of assistance,” Journal of Artificial Intelligence Research 50 (2014): 71–104.

11. A critique of beneficial AI based on a misinterpretation of a journalist’s brief interview with the author in a magazine article: Adam Elkus, “How to be good: Why you can’t teach human values to artificial intelligence,” Slate, April 20, 2016.

12. The origin of trolley problems: Frank Sharp, “A study of the influence of custom on the moral judgment,” Bulletin of the University of Wisconsin 236 (1908).

13. The “anti-natalist” movement believes it is morally wrong for humans to reproduce because to live is to suffer and because humans’ impact on the Earth is profoundly negative. If you consider the existence of humanity to be a moral dilemma, then I suppose I do want machines to resolve this moral dilemma the right way.

14. Statement on China’s AI policy by Fu Ying, vice chair of the Foreign Affairs Committee of the National People’s Congress. In a letter to the 2018 World AI Conference in Shanghai, Chinese president Xi Jinping wrote, “Deepened international cooperation is required to cope with new issues in fields including law, security, employment, ethics and governance.” I am indebted to Brian Tse for bringing these statements to my attention.

15. A very interesting paper on the non-naturalistic non-fallacy, showing how preferences can be inferred from the state of the world as arranged by humans: Rohin Shah et al., “The implicit preference information in an initial state,” in Proceedings of the 7th International Conference on Learning Representations (2019), iclr.cc/Conferences/2019/Schedule.

16. Retrospective on Asilomar: Paul Berg, “Asilomar 1975: DNA modification secured,” Nature 455 (2008): 290–91.

17. News article reporting Putin’s speech on AI: “Putin: Leader in artificial intelligence will rule world,” Associated Press, September 4, 2017.

CHAPTER 8

1. Fermat’s Last Theorem asserts that the equation aⁿ = bⁿ + cⁿ has no solutions with a, b, and c being positive whole numbers and n being a whole number larger than 2. In the margin of his copy of Diophantus’s Arithmetica, Fermat wrote, “I have a truly marvellous proof of this proposition which this margin is too narrow to contain.” True or not, this guaranteed that mathematicians pursued a proof with vigor in the subsequent centuries. We can easily check particular cases—for example, is 7³ equal to 6³ + 5³? (Almost, because 7³ is 343 and 6³ + 5³ is 341, but “almost” doesn’t count.) There are, of course, infinitely many cases to check, and that’s why we need mathematicians and not just computer programmers.
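
The near miss mentioned above is easy to find mechanically. Here is a small brute-force search (purely illustrative; no amount of case checking could settle the theorem, since the cases are infinite):

```python
# Search small positive whole numbers for solutions and near misses of
# a^n = b^n + c^n with n > 2. Finds near misses such as 7^3 = 343 versus
# 6^3 + 5^3 = 341, but of course it can never prove the theorem.
for n in range(3, 6):
    for a in range(2, 50):
        for b in range(1, a):
            for c in range(1, b + 1):
                total = b**n + c**n
                if a**n == total:
                    print(f"solution found?! {a}^{n} = {b}^{n} + {c}^{n}")
                elif abs(a**n - total) <= 2:
                    print(f"near miss: {a}^{n} = {a**n}, "
                          f"{b}^{n} + {c}^{n} = {total}")
```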

2. A paper from the Machine Intelligence Research Institute poses many related issues: Scott Garrabrant and Abram Demski, “Embedded agency,” AI Alignment Forum, November 15, 2018.

3. The classic work on multiattribute utility theory: Ralph Keeney and Howard Raiffa, Decisions with Multiple Objectives: Preferences and Value Tradeoffs (Wiley, 1976).

4. Paper introducing the idea of inverse RL: Stuart Russell, “Learning agents for uncertain environments,” in Proceedings of the 11th Annual Conference on Computational Learning Theory (ACM, 1998).

5. The original paper on structural estimation of Markov decision processes: Thomas Sargent, “Estimation of dynamic labor demand schedules under rational expectations,” Journal of Political Economy 86 (1978): 1009–44.

6. The first algorithms for IRL: Andrew Ng and Stuart Russell, “Algorithms for inverse reinforcement learning,” in Proceedings of the 17th International Conference on Machine Learning, ed. Pat Langley (Morgan Kaufmann, 2000).

7. Better algorithms for inverse RL: Pieter Abbeel and Andrew Ng, “Apprenticeship learning via inverse reinforcement learning,” in Proceedings of the 21st International Conference on Machine Learning, ed. Russ Greiner and Dale Schuurmans (ACM Press, 2004).

8. Understanding inverse RL as Bayesian updating: Deepak Ramachandran and Eyal Amir, “Bayesian inverse reinforcement learning,” in Proceedings of the 20th International Joint Conference on Artificial Intelligence, ed. Manuela Veloso (AAAI Press, 2007).

9. How to teach helicopters to fly and do aerobatic maneuvers: Adam Coates, Pieter Abbeel, and Andrew Ng, “Apprenticeship learning for helicopter control,” Communications of the ACM 52 (2009): 97–105.

10. The original name proposed for an assistance game was a cooperative inverse reinforcement learning game, or CIRL game. See Dylan Hadfield-Menell et al., “Cooperative inverse reinforcement learning,” in Advances in Neural Information Processing Systems 29, ed. Daniel Lee et al. (2016).

11. These numbers are chosen just to make the game interesting.

12. The equilibrium solution to the game can be found by a process called iterated best response: pick any strategy for Harriet; pick the best strategy for Robbie, given Harriet’s strategy; pick the best strategy for Harriet, given Robbie’s strategy; and so on. If this process reaches a fixed point, where neither strategy changes, then we have found a solution. The process unfolds as follows (a numerical sketch of the computation appears after the list):

  1. Start with the greedy strategy for Harriet: make 2 paperclips if she prefers paperclips; make 1 of each if she is indifferent; make 2 staples if she prefers staples.

  2. There are three possibilities Robbie has to consider, given this strategy for Harriet:

     a. If Robbie sees Harriet make 2 paperclips, he infers that she prefers paperclips, so he now believes the value of a paperclip is uniformly distributed between 50¢ and $1.00, with an average of 75¢. In that case, his best plan is to make 90 paperclips with an expected value of $67.50 for Harriet.

     b. If Robbie sees Harriet make 1 of each, he infers that she values paperclips and staples at 50¢, so the best choice is to make 50 of each.

     c. If Robbie sees Harriet make 2 staples, then by the same argument as in 2(a), he should make 90 staples.

  3. Given this strategy for Robbie, Harriet’s best strategy is now somewhat different from the greedy strategy in step 1: if Robbie is going to respond to her making 1 of each by making 50 of each, then she is better off making 1 of each not just if she is exactly indifferent but if she is anywhere close to indifferent. In fact, the optimal policy is now to make 1 of each if she values paperclips anywhere between about 44.6¢ and 55.4¢.

  4. Given this new strategy for Harriet, Robbie’s strategy remains unchanged. For example, if she chooses 1 of each, he infers that the value of a paperclip is uniformly distributed between 44.6¢ and 55.4¢, with an average of 50¢, so the best choice is to make 50 of each. Because Robbie’s strategy is the same as in step 2, Harriet’s best response will be the same as in step 3, and we have found the equilibrium.
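
The iteration above can be reproduced numerically. The sketch below is a minimal implementation under the assumptions stated in the text: Harriet’s paperclip value v is uniform on [0, 1] with a staple worth 1 − v, Harriet first makes 2 items herself, and Robbie then makes 90 paperclips, 50 of each, or 90 staples, maximizing Harriet’s expected value. The variable names and grid size are my own choices.

```python
import numpy as np

# Iterated best response for the paperclip/staple assistance game.
# Signals: 0 = Harriet makes 2 paperclips, 1 = one of each, 2 = 2 staples.
v = np.linspace(0.0, 1.0, 100_001)    # paperclip values (staple worth 1 - v)
PLANS = [(90, 0), (50, 50), (0, 90)]  # Robbie's options: (paperclips, staples)

def harriet_value(signal, plan, v):
    """Harriet's total value: her own 2 items plus Robbie's production."""
    own = {0: 2 * v, 1: v + (1 - v), 2: 2 * (1 - v)}[signal]
    p, s = plan
    return own + p * v + s * (1 - v)

# Step 1: Harriet's greedy policy (signal whatever she likes best).
policy = np.where(v > 0.5, 0, np.where(v < 0.5, 2, 1))
for _ in range(10):                   # iterate best responses to a fixed point
    # Robbie: posterior mean of v given each signal, then the best plan.
    response = {}
    for sig in (0, 1, 2):
        mean_v = v[policy == sig].mean()
        response[sig] = max(PLANS, key=lambda pl: pl[0] * mean_v
                                                 + pl[1] * (1 - mean_v))
    # Harriet: best signal given Robbie's responses.
    values = np.stack([harriet_value(s, response[s], v) for s in (0, 1, 2)])
    policy = values.argmax(axis=0)

mid = v[policy == 1]
print(f"Harriet makes one of each for v in [{mid[0]:.3f}, {mid[-1]:.3f}]")
# Prints roughly [0.446, 0.554], matching the thresholds in step 3.
```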

13. For a more complete analysis of the off-switch game, see Dylan Hadfield-Menell et al., “The off-switch game,” in Proceedings of the 26th International Joint Conference on Artificial Intelligence, ed. Carles Sierra (IJCAI, 2017).

14. The proof of the general result is quite simple if you don’t mind integral signs. Let P(u) be Robbie’s prior probability density over Harriet’s utility for the proposed action a. Then the value of going ahead with a is

EU(a) = ∫[−∞, 0] u P(u) du + ∫[0, ∞] u P(u) du.

(We will see shortly why the integral is split up in this way.) On the other hand, the value of action d, deferring to Harriet, is composed of two parts: if u > 0, then Harriet lets Robbie go ahead, so the value is u, but if u < 0, then Harriet switches Robbie off, so the value is 0:

EU(d) = ∫[−∞, 0] 0 · P(u) du + ∫[0, ∞] u P(u) du = ∫[0, ∞] u P(u) du.

Comparing the expressions for EU(a) and EU(d), we see immediately that EU(d) ≥ EU(a) because the expression for EU(d) has the negative-utility region zeroed out. The two choices have equal value only when the negative region has zero probability—that is, when Robbie is already certain that Harriet likes the proposed action. The theorem is a direct analog of the well-known theorem concerning the non-negative expected value of information.
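
The inequality can also be checked numerically. The snippet below is an illustrative check, not part of the proof: the Gaussian prior anticipates note 17 below, and the grid and means are made-up values.

```python
import numpy as np

# Compare EU(a) and EU(d) for a Gaussian prior P(u) with mean mu, sd 1.
u = np.linspace(-40.0, 40.0, 1_000_001)
du = u[1] - u[0]
for mu in (-1.0, 0.0, 1.0, 5.0):
    p = np.exp(-0.5 * (u - mu) ** 2) / np.sqrt(2 * np.pi)   # prior density
    eu_a = np.sum(u * p) * du                   # go ahead: integrate u P(u)
    eu_d = np.sum(np.maximum(u, 0.0) * p) * du  # defer: negative region zeroed
    print(f"mu = {mu:+.1f}:  EU(a) = {eu_a:+.4f}  EU(d) = {eu_d:+.4f}")
# EU(d) >= EU(a) in every case; the two coincide only when mu is so large
# that Robbie is already (almost) certain Harriet likes the action.
```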

15. Perhaps the next elaboration in line, for the one human–one robot case, is to consider a Harriet who does not yet know her own preferences regarding some aspect of the world, or whose preferences have not yet been formed.

16. To see how exactly Robbie converges to an incorrect belief, consider a model in which Harriet is slightly irrational, making errors with a probability that diminishes exponentially as the size of error increases. Robbie offers Harriet 4 paperclips in return for 1 staple; she refuses. According to Robbie’s beliefs, this is irrational: even at 25¢ per paperclip and 75¢ per staple, she should accept 4 for 1. Therefore, she must have made a mistake—but this mistake is much more likely if her true value is 25¢ than if it is, say, 30¢, because the error costs her a lot more if her value for paperclips is 30¢. Now Robbie’s probability distribution has 25¢ as the most likely value because it represents the smallest error on Harriet’s part, with exponentially lower probabilities for values higher than 25¢. If he keeps trying the same experiment, the probability distribution becomes more and more concentrated close to 25¢. In the limit, Robbie becomes certain that Harriet’s value for paperclips is 25¢.
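
A toy simulation shows the effect. All of the numbers below are assumptions of mine: the prior is uniform on [25¢, 75¢], a staple is worth one dollar minus the paperclip value, and Harriet refuses with a Boltzmann probability that shrinks exponentially in the cost of the error:

```python
import numpy as np

# Robbie's prior: paperclip value v uniform on [0.25, 0.75] (in dollars).
# Accepting 4 paperclips for 1 staple gains Harriet 4v - (1 - v) = 5v - 1,
# positive everywhere in the support, so refusal is always an error.
v = np.linspace(0.25, 0.75, 501)
posterior = np.ones_like(v) / len(v)         # start from the uniform prior
lam = 5.0                                    # assumed rationality parameter
gain = 5 * v - 1                             # Harriet's gain from accepting
p_refuse = 1.0 / (1.0 + np.exp(lam * gain))  # bigger error, less likely

for n in (1, 10, 100):
    post = posterior * p_refuse ** n         # update on n observed refusals
    post /= post.sum()
    print(f"after {n:3d} refusals: posterior mean = {(v * post).sum():.3f}")
# The posterior mean sinks toward 0.25 even if Harriet's true value is,
# say, 0.30: 0.25 is the value that makes each refusal the smallest error.
```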

17. Robbie could, for example, have a normal (Gaussian) distribution for his prior belief about the exchange rate, which stretches from −∞ to +∞.

18. For an example of the kind of mathematical analysis that may be needed, see Avrim Blum, Lisa Hellerstein, and Nick Littlestone, “Learning in the presence of finitely or infinitely many irrelevant attributes,” Journal of Computer and System Sciences 50 (1995): 32–40. Also Lori Dalton, “Optimal Bayesian feature selection,” in Proceedings of the 2013 IEEE Global Conference on Signal and Information Processing, ed. Charles Bouman, Robert Nowak, and Anna Scaglione (IEEE, 2013).

19. Here I am rephrasing slightly a question by Moshe Vardi at the Asilomar Conference on Beneficial AI, 2017.

20. Michael Wellman and Jon Doyle, “Preferential semantics for goals,” in Proceedings of the 9th National Conference on Artificial Intelligence (AAAI Press, 1991). This paper draws on a much earlier proposal by Georg von Wright, “The logic of preference reconsidered,” Theory and Decision 3 (1972): 140–67.

21. My late Berkeley colleague has the distinction of becoming an adjective. See Paul Grice, Studies in the Way of Words (Harvard University Press, 1989).

22. The original paper on direct stimulation of pleasure centers in the brain: James Olds and Peter Milner, “Positive reinforcement produced by electrical stimulation of septal area and other regions of rat brain,” Journal of Comparative and Physiological Psychology 47 (1954): 419–27.

23. Letting rats push the button: James Olds, “Self-stimulation of the brain; its use to study local effects of hunger, sex, and drugs,” Science 127 (1958): 315–24.

24. Letting humans push the button: Robert Heath, “Electrical self-stimulation of the brain in man,” American Journal of Psychiatry 120 (1963): 571–77.

25. A first mathematical treatment of wireheading, showing how it occurs in reinforcement learning agents: Mark Ring and Laurent Orseau, “Delusion, survival, and intelligent agents,” in Artificial General Intelligence: 4th International Conference, ed. Jürgen Schmidhuber, Kristinn Thórisson, and Moshe Looks (Springer, 2011). One possible solution to the wireheading problem: Tom Everitt and Marcus Hutter, “Avoiding wireheading with value reinforcement learning,” arXiv:1605.03143 (2016).

26. How it might be possible for an intelligence explosion to occur safely: Benja Fallenstein and Nate Soares, “Vingean reflection: Reliable reasoning for self-improving agents,” technical report 2015-2, Machine Intelligence Research Institute, 2015.

27. The difficulty agents face in reasoning about themselves and their successors: Benja Fallenstein and Nate Soares, “Problems of self-reference in self-improving space-time embedded intelligence,” in Artificial General Intelligence: 7th International Conference, ed. Ben Goertzel, Laurent Orseau, and Javier Snaider (Springer, 2014).

28. Showing why an agent might pursue an objective different from its true objective if its computational abilities are limited: Jonathan Sorg, Satinder Singh, and Richard Lewis, “Internal rewards mitigate agent boundedness,” in Proceedings of the 27th International Conference on Machine Learning, ed. Johannes Fürnkranz and Thorsten Joachims (2010), icml.cc/Conferences/2010/papers/icml2010proceedings.zip.

CHAPTER 9

1. Some have argued that biology and neuroscience are also directly relevant. See, for example, Gopal Sarma, Adam Safron, and Nick Hay, “Integrative biological simulation, neuropsychology, and AI safety,” arxiv.org/abs/1811.03493 (2018).

2. On the possibility of making computers liable for damages: Paulius Čerka, Jurgita Grigienė, and Gintarė Sirbikytė, “Liability for damages caused by artificial intelligence,” Computer Law and Security Review 31 (2015): 376–89.

3. For an excellent machine-oriented introduction to standard ethical theories and their implications for designing AI systems, see Wendell Wallach and Colin Allen, Moral Machines: Teaching Robots Right from Wrong (Oxford University Press, 2008).

4. The sourcebook for utilitarian thought: Jeremy Bentham, An Introduction to the Principles of Morals and Legislation (T. Payne & Son, 1789).

5. Mill’s elaboration of his tutor Bentham’s ideas was extraordinarily influential on liberal thought: John Stuart Mill, Utilitarianism (Parker, Son & Bourn, 1863).

6. The paper introducing preference utilitarianism and preference autonomy: John Harsanyi, “Morality and the theory of rational behavior,” Social Research 44 (1977): 623–56.

7. An argument for social aggregation via weighted sums of utilities when deciding on behalf of multiple individuals: John Harsanyi, “Cardinal welfare, individualistic ethics, and interpersonal comparisons of utility,” Journal of Political Economy 63 (1955): 309–21.

8. A generalization of Harsanyi’s social aggregation theorem to the case of unequal prior beliefs: Andrew Critch, Nishant Desai, and Stuart Russell, “Negotiable reinforcement learning for Pareto optimal sequential decision-making,” in Advances in Neural Information Processing Systems 31, ed. Samy Bengio et al. (2018).

9. The sourcebook for ideal utilitarianism: G. E. Moore, Ethics (Williams & Norgate, 1912).

10. News article citing Stuart Armstrong’s colorful example of misguided utility maximization: Chris Matyszczyk, “Professor warns robots could keep us in coffins on heroin drips,” CNET, June 29, 2015.

11. Popper’s theory of negative utilitarianism (so named later by Smart): Karl Popper, The Open Society and Its Enemies (Routledge, 1945).

12. A refutation of negative utilitarianism: R. Ninian Smart, “Negative utilitarianism,” Mind 67 (1958): 542–43.

13. For a typical argument for risks arising from “end human suffering” commands, see “Why do we think AI will destroy us?,” Reddit, reddit.com/r/Futurology/comments/38fp6o/why_do_we_think_ai_will_destroy_us.

14. A good source for self-deluding incentives in AI: Ring and Orseau, “Delusion, survival, and intelligent agents.”

15. On the impossibility of interpersonal comparisons of utility: W. Stanley Jevons, The Theory of Political Economy (Macmillan, 1871).

16. The utility monster makes its appearance in Robert Nozick, Anarchy, State, and Utopia (Basic Books, 1974).

17. For example, we can fix immediate death to have a utility of 0 and a maximally happy life to have a utility of 1. See John Isbell, “Absolute games,” in Contributions to the Theory of Games, vol. 4, ed. Albert Tucker and R. Duncan Luce (Princeton University Press, 1959).

18. The oversimplified nature of Thanos’s population-halving policy is discussed by Tim Harford, “Thanos shows us how not to be an economist,” Financial Times, April 20, 2019. Even before the film debuted, defenders of Thanos began to congregate on the subreddit r/thanosdidnothingwrong/. In keeping with the subreddit’s motto, 350,000 of the 700,000 members were later purged.

19. On utilities for populations of different sizes: Henry Sidgwick, The Methods of Ethics (Macmillan, 1874).

20. The Repugnant Conclusion and other knotty problems of utilitarian thinking: Derek Parfit, Reasons and Persons (Oxford University Press, 1984).

21. For a concise summary of axiomatic approaches to population ethics, see Peter Eckersley, “Impossibility and uncertainty theorems in AI value alignment,” in Proceedings of the AAAI Workshop on Artificial Intelligence Safety, ed. Huáscar Espinoza et al. (2019).

22. Calculating the long-term carrying capacity of the Earth: Daniel O’Neill et al., “A good life for all within planetary boundaries,” Nature Sustainability 1 (2018): 88–95.

23. For an application of moral uncertainty to population ethics, see Hilary Greaves and Toby Ord, “Moral uncertainty about population axiology,” Journal of Ethics and Social Philosophy 12 (2017): 135–67. A more comprehensive analysis is provided by Will MacAskill, Krister Bykvist, and Toby Ord, Moral Uncertainty (Oxford University Press, forthcoming).

24. Quotation showing that Smith was not so obsessed with selfishness as is commonly imagined: Adam Smith, The Theory of Moral Sentiments (Andrew Millar; Alexander Kincaid and J. Bell, 1759).

25. For an introduction to the economics of altruism, see Serge-Christophe Kolm and Jean Ythier, eds., Handbook of the Economics of Giving, Altruism and Reciprocity, 2 vols. (North-Holland, 2006).

26. On charity as selfish: James Andreoni, “Impure altruism and donations to public goods: A theory of warm-glow giving,” Economic Journal 100 (1990): 464–77.

27. For those who like equations: let Alice’s intrinsic well-being be measured by wA and Bob’s by wB. Then the utilities for Alice and Bob are defined as follows:

UA = wA + CAB wB

UB = wB + CBA wA.

Some authors suggest that Alice cares about Bob’s overall utility UB rather than just his intrinsic well-being wB, but this leads to a kind of circularity in that Alice’s utility depends on Bob’s utility which depends on Alice’s utility; sometimes stable solutions can be found but the underlying model can be questioned. See, for example, Hajime Hori, “Nonpaternalistic altruism and functional interdependence of social preferences,” Social Choice and Welfare 32 (2009): 59–77.
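
The circularity is just a pair of simultaneous linear equations, so one can see directly when a stable solution exists. A minimal sketch, with made-up coefficients of my own choosing:

```python
import numpy as np

# "Caring about utilities" model: U_A = w_A + C_AB * U_B and
# U_B = w_B + C_BA * U_A, rearranged into matrix form and solved directly.
w_A, w_B = 1.0, 2.0        # intrinsic well-being (made-up values)
C_AB, C_BA = 0.5, 0.4      # mutual caring coefficients (made-up values)

M = np.array([[1.0, -C_AB],
              [-C_BA, 1.0]])
U_A, U_B = np.linalg.solve(M, np.array([w_A, w_B]))
print(f"U_A = {U_A:.3f}, U_B = {U_B:.3f}")   # the stable fixed point

# A solution exists iff det(M) = 1 - C_AB * C_BA is nonzero, and utilities
# blow up as C_AB * C_BA approaches 1, which is one reason to prefer the
# well-being formulation U_A = w_A + C_AB * w_B used in this note.
```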

28. Models in which each individual’s utility is a linear combination of everyone’s well-being are just one possibility. Much more general models are possible—for example, models in which some individuals prefer to avoid severe inequalities in the distribution of well-being, even at the expense of reducing the total, while other individuals would really prefer that no one have preferences about inequality at all. Thus, the overall approach I am proposing accommodates multiple moral theories held by individuals; at the same time, it doesn’t insist that any one of those moral theories is correct or should have much sway over outcomes for those who hold a different theory. I am indebted to Toby Ord for pointing out this feature of the approach.

29. Arguments of this type have been made against policies designed to ensure equality of outcome, notably by the American legal philosopher Ronald Dworkin. See, for example, Ronald Dworkin, “What is equality? Part 1: Equality of welfare,” Philosophy and Public Affairs 10 (1981): 185–246. I am indebted to Iason Gabriel for this reference.

30. Malice in the form of revenge-based punishment for transgressions is certainly a common tendency. Although it plays a social role in keeping members of a community in line, it can be replaced by an equally effective policy driven by deterrence and prevention—that is, weighing the intrinsic harm done when punishing the transgressor against the benefits to the larger society.

31. Let EAB and PAB be Alice’s coefficients of envy and pride respectively, and assume that they apply to the difference in well-being. Then a (somewhat oversimplified) formula for Alice’s utility could be the following:

UA = wA + CAB wBEAB (wBwA) + PAB (wAwB)

      = (1 + EAB + PAB) wA + (CABEABPAB) wB.

Thus, if Alice has positive pride and envy coefficients, they act on Bob’s welfare exactly like sadism and malice coefficients: Alice is happier if Bob’s welfare is lowered, all other things being equal. In reality, pride and envy typically apply not to differences in well-being but to differences in visible aspects thereof, such as status and possessions. Bob’s hard toil in acquiring his possessions (which lowers his overall well-being) may not be visible to Alice. This can lead to the self-defeating behaviors that go under the heading of “keeping up with the Joneses.”

32. On the sociology of conspicuous consumption: Thorstein Veblen, The Theory of the Leisure Class: An Economic Study of Institutions (Macmillan, 1899).

33. Fred Hirsch, The Social Limits to Growth (Routledge & Kegan Paul, 1977).

34. I am indebted to Ziyad Marar for pointing me to social identity theory and its importance in understanding human motivation and behavior. See, for example, Dominic Abrams and Michael Hogg, eds., Social Identity Theory: Constructive and Critical Advances (Springer, 1990). For a much briefer summary of the main ideas, see Ziyad Marar, “Social identity,” in This Idea Is Brilliant: Lost, Overlooked, and Underappreciated Scientific Concepts Everyone Should Know, ed. John Brockman (Harper Perennial, 2018).

35. Here, I am not suggesting that we necessarily need a detailed understanding of the neural implementation of cognition; what is needed is a model at the “software” level of how preferences, both explicit and implicit, generate behavior. Such a model would need to incorporate what is known about the reward system.

36. Ralph Adolphs and David Anderson, The Neuroscience of Emotion: A New Synthesis (Princeton University Press, 2018).

37. See, for example, Rosalind Picard, Affective Computing, 2nd ed. (MIT Press, 1998).

38. Waxing lyrical on the delights of the durian: Alfred Russel Wallace, The Malay Archipelago: The Land of the Orang-Utan, and the Bird of Paradise (Macmillan, 1869).

39. A less rosy view of the durian: Alan Davidson, The Oxford Companion to Food (Oxford University Press, 1999). Buildings have been evacuated and planes turned around in mid-flight because of the durian’s overpowering odor.

40. I discovered after writing this chapter that the durian was used for exactly the same philosophical purpose by Laurie Paul, Transformative Experience (Oxford University Press, 2014). Paul suggests that uncertainty about one’s own preferences presents fatal problems for decision theory, a view contradicted by Richard Pettigrew, “Transformative experience and decision theory,” Philosophy and Phenomenological Research 91 (2015): 766–74. Neither author refers to the early work of Harsanyi, “Games with incomplete information, Parts I–III,” or Cyert and de Groot, “Adaptive utility.”

41. An initial paper on helping humans who don’t know their own preferences and are learning about them: Lawrence Chan et al., “The assistive multi-armed bandit,” in Proceedings of the 14th ACM/IEEE International Conference on Human–Robot Interaction (HRI), ed. David Sirkin et al. (IEEE, 2019).

42. Eliezer Yudkowsky, in Coherent Extrapolated Volition (Singularity Institute, 2004), lumps all these aspects, as well as plain inconsistency, under the heading of muddle—a term that has not, unfortunately, caught on.

43. On the two selves who evaluate experiences: Daniel Kahneman, Thinking, Fast and Slow (Farrar, Straus & Giroux, 2011).

44. Edgeworth’s hedonimeter, an imaginary device for measuring happiness moment to moment: Francis Edgeworth, Mathematical Psychics: An Essay on the Application of Mathematics to the Moral Sciences (Kegan Paul, 1881).

45. A standard text on sequential decisions under uncertainty: Martin Puterman, Markov Decision Processes: Discrete Stochastic Dynamic Programming (Wiley, 1994).

46. 关于证明效用随时间可加表示的公理假设:Tjalling Koopmans,《偏好排序随时间的表示》,载于《决策与组织》,C. Bartlett McGuire、Roy Radner 和 Kenneth Arrow 编辑(北荷兰出版社,1972 年)。

46. On axiomatic assumptions that justify additive representations of utility over time: Tjalling Koopmans, “Representation of preference orderings over time,” in Decision and Organization, ed. C. Bartlett McGuire, Roy Radner, and Kenneth Arrow (North-Holland, 1972).

47. 2019年的人类(他们可能在 2099 年早已死去,也可能只是 2099 年人类的早期自我)可能希望以尊重 2019 年人类的 2019 年偏好的方式制造机器,而不是迎合 2099 年人类无疑肤浅和考虑不周的偏好。这就像起草一部不允许任何修改的宪法。如果 2099 年的人类经过适当的审议后,决定他们希望推翻 2019 年人类的内在偏好,那么他们应该能够这样做似乎是合理的。毕竟,他们和他们的后代必须承担后果。

47. The 2019 humans (who might, in 2099, be long dead or might just be the earlier selves of 2099 humans) might wish to build the machines in a way that respects the 2019 preferences of the 2019 humans rather than pandering to the undoubtedly shallow and ill-considered preferences of humans in 2099. This would be like drawing up a constitution that disallows any amendments. If the 2099 humans, after suitable deliberation, decide they wish to override the preferences built in by the 2019 humans, it seems reasonable that they should be able to do so. After all, it is they and their descendants who have to live with the consequences.

48.我非常感谢温德尔·瓦拉赫的这一观察。

48. I am indebted to Wendell Wallach for this observation.

49. 一篇探讨偏好随时间变化的早期论文:John Harsanyi,《可变偏好的福利经济学》,《经济研究评论》第 21 卷(1953 年):204–13 页。Franz Dietrich 和 Christian List 提供了一篇较新(且偏技术性)的综述:《偏好从何而来?》,《国际博弈论杂志》第 42 卷(2013 年):613–37 页。另请参阅 Laurie Paul 的《变革性体验》(牛津大学出版社,2014 年)和 Richard Pettigrew 的《为改变的自我而选择》,philpapers.org/archive/PETCFC.pdf。

49. An early paper dealing with changes in preferences over time: John Harsanyi, “Welfare economics of variable tastes,” Review of Economic Studies 21 (1953): 204–13. A more recent (and somewhat technical) survey is provided by Franz Dietrich and Christian List, “Where do preferences come from?,” International Journal of Game Theory 42 (2013): 613–37. See also Laurie Paul, Transformative Experience (Oxford University Press, 2014), and Richard Pettigrew, “Choosing for Changing Selves,” philpapers.org/archive/PETCFC.pdf.

50. 关于非理性的理性分析,参见乔恩·埃尔斯特著《尤利西斯与塞壬:理性与非理性研究》(剑桥大学出版社,1979 年)。

50. For a rational analysis of irrationality, see Jon Elster, Ulysses and the Sirens: Studies in Rationality and Irrationality (Cambridge University Press, 1979).

51.有关人类认知假肢的有前景的想法,请参阅 Falk Lieder,“超越有限理性:逆向工程和增强人类智能”(博士论文,加州大学伯克利分校,2018 年)。

51. For promising ideas on cognitive prostheses for humans, see Falk Lieder, “Beyond bounded rationality: Reverse-engineering and enhancing human intelligence” (PhD thesis, University of California, Berkeley, 2018).

第十章

CHAPTER 10

1.关于辅助游戏在驾驶中的应用:Dorsa Sadigh 等人,“规划与人协调的汽车”, Autonomous Robots 42(2018):1405–26。

1. On the application of assistance games to driving: Dorsa Sadigh et al., “Planning for cars that coordinate with people,” Autonomous Robots 42 (2018): 1405–26.

2. 奇怪的是,苹果没有出现在这份名单上。它确实有一个人工智能研究小组,而且正在迅速扩张。它传统的保密文化意味着,到目前为止,它在思想市场上的影响力相当有限。

2. Apple is, curiously, absent from this list. It does have an AI research group and is ramping up rapidly. Its traditional culture of secrecy means that its impact in the marketplace of ideas is quite limited so far.

3.马克斯·泰格马克 (Max Tegmark),访谈《你相信这台电脑吗?》 ,由克里斯·潘恩 (Chris Paine) 执导,马克·门罗 (Mark Monroe) 编剧 (2018)。

3. Max Tegmark, interview, Do You Trust This Computer?, directed by Chris Paine, written by Mark Monroe (2018).

4. 关于评估网络犯罪的影响:“网络犯罪造成 6000 亿美元损失,银行首当其冲”,《安全杂志》,2018 年 2 月 21 日。

4. On estimating the impact of cybercrime: “Cybercrime cost $600 billion and targets banks first,” Security Magazine, February 21, 2018.

附录 A

APPENDIX A

1. 此后六十年国际象棋程序的基本蓝图:克劳德·香农,《编写计算机下棋程序》,《哲学杂志》第 7 辑,第 41 期(1950 年):256–75 页。香农的提议借鉴了数百年来通过累加棋子分值来评估国际象棋局面的传统;例如,参见 Pietro Carrera 的《Il gioco degli scacchi》(Giovanni de Rossi,1617 年)。

1. The basic plan for chess programs of the next sixty years: Claude Shannon, “Programming a computer for playing chess,” Philosophical Magazine, 7th ser., 41 (1950): 256–75. Shannon’s proposal drew on a centuries-long tradition of evaluating chess positions by adding up piece values; see, for example, Pietro Carrera, Il gioco degli scacchi (Giovanni de Rossi, 1617).
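
下面是一个极简的 Python 示意片段(为说明而加的草图,并非出自本书或香农的论文):按约定分值对双方棋子求和来为局面打分,即本条注释所说的数百年评估传统;所用分值只是常见的教科书约定。

As an illustration, here is a minimal Python sketch (not from the original text or Shannon's paper) of the evaluation tradition this note describes: score a position by adding up conventional piece values. The specific values are the usual textbook convention, assumed for the example.

PIECE_VALUES = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9}  # 常用约定分值 / conventional values

def material_score(pieces):
    # pieces: 每种棋子 -> (白方数量, 黑方数量);正分对白方有利。
    # pieces maps piece letters to (white_count, black_count); positive favors White.
    return sum(PIECE_VALUES[p] * (w - b) for p, (w, b) in pieces.items())

# 例:白方以一马换得一车,净优约 2 分。
# Example: White is up a rook for a knight, a net advantage of about 2.
print(material_score({"P": (8, 8), "N": (1, 2), "B": (2, 2),
                      "R": (2, 1), "Q": (1, 1)}))  # 2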

2. 一份描述塞缪尔在跳棋早期强化学习算法上的开创性研究的报告:Arthur Samuel,《利用跳棋游戏进行机器学习的一些研究》,《IBM 研究与开发杂志》第 3 卷(1959 年):210–29 页。

2. A report describing Samuel’s heroic research on an early reinforcement learning algorithm for checkers: Arthur Samuel, “Some studies in machine learning using the game of checkers,” IBM Journal of Research and Development 3 (1959): 210–29.

3. 理性元推理的概念及其在搜索和博弈中的应用源自我的学生 Eric Wefald 的论文研究;他不幸死于车祸,未能亲自写成其研究成果。以下著作在他身后出版:Stuart Russell 和 Eric Wefald,《做正确的事:有限理性研究》(麻省理工学院出版社,1991 年)。另请参阅 Eric Horvitz 的《在有限资源下优化决策的理性元推理和汇编》,载于《计算智能,II:国际研讨会论文集》,Francesco Gardin 和 Giancarlo Mauri 主编(北荷兰出版社,1990 年);以及 Stuart Russell 和 Eric Wefald 的《使用理性元推理进行最优博弈树搜索》,载于《第 11 届国际人工智能联合会议论文集》,Natesa Sridharan 主编(摩根考夫曼出版社,1989 年)。

3. The concept of rational metareasoning and its application to search and game playing emerged from the thesis research of my student Eric Wefald, who died tragically in a car accident before he could write up his work; the following appeared posthumously: Stuart Russell and Eric Wefald, Do the Right Thing: Studies in Limited Rationality (MIT Press, 1991). See also Eric Horvitz, “Rational metareasoning and compilation for optimizing decisions under bounded resources,” in Computational Intelligence, II: Proceedings of the International Symposium, ed. Francesco Gardin and Giancarlo Mauri (North-Holland, 1990); and Stuart Russell and Eric Wefald, “On optimal game-tree search using rational meta-reasoning,” in Proceedings of the 11th International Joint Conference on Artificial Intelligence, ed. Natesa Sridharan (Morgan Kaufmann, 1989).

4.或许这是第一篇展示层级组织如何降低规划组合复杂性的论文:Herbert Simon,《复杂性的架构》,《美国哲学学会会刊》 106(1962):467-482。

4. Perhaps the first paper showing how hierarchical organization reduces the combinatorial complexity of planning: Herbert Simon, “The architecture of complexity,” Proceedings of the American Philosophical Society 106 (1962): 467–82.

5.分层规划的典型参考文献是 Earl Sacerdoti 的《抽象空间层次规划》,《人工智能》第 5 卷(1974 年):115-135 页。另请参阅 Austin Tate 的《生成项目网络》,《第五届人工智能国际联合会议论文集》,Raj Reddy 主编(Morgan Kaufmann,1977 年)。

5. The canonical reference for hierarchical planning is Earl Sacerdoti, “Planning in a hierarchy of abstraction spaces,” Artificial Intelligence 5 (1974): 115–35. See also Austin Tate, “Generating project networks,” in Proceedings of the 5th International Joint Conference on Artificial Intelligence, ed. Raj Reddy (Morgan Kaufmann, 1977).

6.高级操作的正式定义:Bhaskara Marthi、Stuart Russell 和 Jason Wolfe,《高级操作的天使语义》,载于《第 17 届自动规划和调度国际会议论文集》 ,由 Mark Boddy、 Maria Fox 和 Sylvie Thiébaux 编辑(AAAI Press,2007 年)。

6. A formal definition of what high-level actions do: Bhaskara Marthi, Stuart Russell, and Jason Wolfe, “Angelic semantics for high-level actions,” in Proceedings of the 17th International Conference on Automated Planning and Scheduling, ed. Mark Boddy, Maria Fox, and Sylvie Thiébaux (AAAI Press, 2007).

附录 B

APPENDIX B

1. 这个例子不太可能出自亚里士多德,而可能源自塞克斯都·恩披里柯(Sextus Empiricus),他大约生活在公元二世纪或三世纪。

1. This example is unlikely to be from Aristotle, but may have originated with Sextus Empiricus, who lived probably in the second or third century CE.

2. 一阶逻辑中的第一个定理证明算法通过将一阶语句归约为(数量极大的)命题语句来工作:Martin Davis 和 Hilary Putnam,《量化理论的计算程序》,Journal of the ACM 7(1960):201–15。

2. The first algorithm for theorem-proving in first-order logic worked by reducing first-order sentences to (very large numbers of) propositional sentences: Martin Davis and Hilary Putnam, “A computing procedure for quantification theory,” Journal of the ACM 7 (1960): 201–15.

3.一种改进的命题推理算法: Martin Davis、George Logemann 和 Donald Loveland,《定理证明的机器程序》,《ACM 通讯》 5(1962 年):394-97。

3. An improved algorithm for propositional inference: Martin Davis, George Logemann, and Donald Loveland, “A machine program for theorem-proving,” Communications of the ACM 5 (1962): 394–97.

4. 可满足性问题(判定一组语句是否在某个世界中为真)是 NP 完全问题。推理问题(判定一个语句是否由已知语句推出)是 co-NP 完全问题,这类问题被认为比 NP 完全问题更难。

4. The satisfiability problem—deciding whether a collection of sentences is true in some world—is NP-complete. The reasoning problem—deciding whether a sentence follows from the known sentences—is co-NP-complete, a class that is thought to be harder than NP-complete problems.
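
作为说明,下面的 Python 草图(为说明而加的示意,并非书中内容)用暴力枚举真值指派来判定可满足性,并把推理问题归约为“知识库加上查询的否定不可满足”;正是这种最坏情况下的指数级枚举,反映了这两个问题的难度。

As an illustration, the following Python sketch (not from the original text) decides satisfiability by brute force over truth assignments, and reduces the reasoning problem to unsatisfiability of the knowledge base plus the negated query; the exponential enumeration is what the worst-case hardness of both problems reflects.

from itertools import product

def satisfiable(sentences, variables):
    # 可满足性:是否存在某个真值指派使所有语句为真(n 个变量枚举 2^n 个世界)。
    # Satisfiability: does some assignment make every sentence true? (2^n worlds for n variables)
    worlds = (dict(zip(variables, vals))
              for vals in product([True, False], repeat=len(variables)))
    return any(all(s(w) for s in sentences) for w in worlds)

def entails(kb, query, variables):
    # 推理问题:KB 蕴涵 query 当且仅当 KB 加上 query 的否定不可满足。
    # Entailment: KB entails query iff KB plus the negation of query is unsatisfiable.
    return not satisfiable(kb + [lambda w: not query(w)], variables)

# 例:由“p 或 q”和“非 p”推出 q。 Example: from "p or q" and "not p", conclude q.
kb = [lambda w: w["p"] or w["q"], lambda w: not w["p"]]
print(entails(kb, lambda w: w["q"], ["p", "q"]))  # True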

5. 此规则有两个例外:禁止重复(不得落子使棋盘回到先前出现过的局面)和禁止自杀(不得将棋子下在会立即被提走的位置,例如它已被包围)。

5. There are two exceptions to this rule: no repetition (a stone may not be played that returns the board to a situation that existed previously) and no suicide (a stone may not be placed such that it would immediately be captured—for example, if it is already surrounded).

6. 介绍我们今天所理解的一阶逻辑的著作(Begriffsschrift 意为“概念写作”):Gottlob Frege,Begriffsschrift, eine der arithmetischen nachgebildete Formelsprache des reinen Denkens(Halle,1879)。弗雷格的一阶逻辑符号非常怪异和笨拙,因此很快就被 Giuseppe Peano 引入的符号所取代,后者至今仍被广泛使用。

6. The work that introduced first-order logic as we understand it today (Begriffsschrift means “concept writing”): Gottlob Frege, Begriffsschrift, eine der arithmetischen nachgebildete Formelsprache des reinen Denkens (Halle, 1879). Frege’s notation for first-order logic was so bizarre and unwieldy that it was soon replaced by the notation introduced by Giuseppe Peano, which remains in common use today.

7. 日本通过知识型系统争夺霸权的总结:Edward Feigenbaum 和 Pamela McCorduck,《第五代:人工智能和日本对世界的计算机挑战》(Addison-Wesley,1983 年)。

7. A summary of Japan’s bid for supremacy through knowledge-based systems: Edward Feigenbaum and Pamela McCorduck, The Fifth Generation: Artificial Intelligence and Japan’s Computer Challenge to the World (Addison-Wesley, 1983).

8. 美国的努力包括战略计算计划和微电子与计算机技术公司(MCC)的成立。请参阅 Alex Roland 和 Philip Shiman 的《战略计算:DARPA 和机器智能探索,1983–1993》(麻省理工学院出版社,2002 年)。

8. The US efforts included the Strategic Computing Initiative and the formation of the Microelectronics and Computer Technology Corporation (MCC). See Alex Roland and Philip Shiman, Strategic Computing: DARPA and the Quest for Machine Intelligence, 1983–1993 (MIT Press, 2002).

9. 20 世纪 80 年代英国对人工智能重新出现的响应历史:Brian Oakley 和 Kenneth Owen,《Alvey:英国战略计算计划》(麻省理工学院出版社,1990 年)。

9. A history of Britain’s response to the re-emergence of AI in the 1980s: Brian Oakley and Kenneth Owen, Alvey: Britain’s Strategic Computing Initiative (MIT Press, 1990).

10. GOFAI一词起源:John Haugeland,《人工智能:真正的想法》(麻省理工学院出版社,1985 年)。

10. The origin of the term GOFAI: John Haugeland, Artificial Intelligence: The Very Idea (MIT Press, 1985).

11. 采访 Demis Hassabis 谈人工智能和深度学习的未来:Nick Heath,《Google DeepMind 创始人 Demis Hassabis:关于人工智能的三个真相》,TechRepublic,2018 年 9 月 24 日。

11. Interview with Demis Hassabis on the future of AI and deep learning: Nick Heath, “Google DeepMind founder Demis Hassabis: Three truths about AI,” TechRepublic, September 24, 2018.

附录 C

APPENDIX C

1. Pearl 的工作于 2011 年获得了图灵奖。

1. Pearl’s work was recognized by the Turing Award in 2011.

2. 更详细地介绍贝叶斯网络:网络中的每个节点都标注了在其父节点(即指向它的节点)的每种可能取值组合下,该节点取每个可能值的概率。例如,当 D1 和 D2 取值相同时,Doubles12 取值为真的概率为 1.0,否则为 0.0。一个可能世界是对所有节点的一组取值指派。这样一个世界的概率等于各节点相应概率的乘积。

2. Bayes nets in more detail: Every node in the network is annotated with the probability of each possible value, given each possible combination of values for the node’s parents (that is, those nodes that point to it). For example, the probability that Doubles12 has value true is 1.0 when D1 and D2 have the same value, and 0.0 otherwise. A possible world is an assignment of values to all the nodes. The probability of such a world is the product of the appropriate probabilities from each of the nodes.
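
下面用本条注释中的双骰子例子给出一个可运行的 Python 示意片段(为说明而加的草图,并非书中内容):一个可能世界的概率,等于各节点在给定其父节点取值下的相应概率的乘积;骰子为公平骰子(先验 1/6)是为举例所作的假设。

Here is a runnable Python sketch of the semantics this note describes, using its two-dice example (not from the original text): a possible world's probability is the product of each node's probability given its parents. The fair-die prior of 1/6 is assumed for the example.

from itertools import product

def p_die(value):
    # 假设骰子是公平的:每个点数的先验概率为 1/6。
    # Assumed fair die: prior probability 1/6 for each face.
    return 1.0 / 6.0

def p_doubles(doubles, d1, d2):
    # Doubles12 的条件概率表:父节点 D1、D2 相同时为真的概率是 1.0,否则为 0.0。
    # CPT for Doubles12 given its parents D1 and D2.
    return 1.0 if doubles == (d1 == d2) else 0.0

def world_probability(d1, d2, doubles):
    # 可能世界的概率 = 各节点在给定其父节点取值下的概率之积。
    # A world's probability is the product of each node's probability given its parents.
    return p_die(d1) * p_die(d2) * p_doubles(doubles, d1, d2)

# 所有可能世界的概率之和应为 1(存在浮点舍入误差)。
# The probabilities of all possible worlds sum to 1 (up to floating-point rounding).
print(sum(world_probability(d1, d2, dbl)
          for d1, d2 in product(range(1, 7), repeat=2) for dbl in (True, False)))

# Doubles12 为真的概率:对所有满足它的世界求和,结果为 1/6。
# Probability that Doubles12 is true: sum over worlds where it holds; equals 1/6.
print(sum(world_probability(d1, d2, True)
          for d1, d2 in product(range(1, 7), repeat=2)))  # ≈ 0.1667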

3.贝叶斯网络应用概要:Olivier Pourret、Patrick Naïm 和 Bruce Marcot 编辑,《贝叶斯网络:应用实用指南》(Wiley,2008 年)。

3. A compendium of applications of Bayes nets: Olivier Pourret, Patrick Naïm, and Bruce Marcot, eds., Bayesian Networks: A Practical Guide to Applications (Wiley, 2008).

4. 概率编程的基础论文:Daphne Koller、David McAllester 和 Avi Pfeffer,《随机程序的有效贝叶斯推理》,载于第 14 届全国人工智能大会论文集(AAAI Press,1997 年)。有关更多参考资料,请参阅 probabilistic-programming.org。

4. The basic paper on probabilistic programming: Daphne Koller, David McAllester, and Avi Pfeffer, “Effective Bayesian inference for stochastic programs,” in Proceedings of the 14th National Conference on Artificial Intelligence (AAAI Press, 1997). For many additional references, see probabilistic-programming.org.

5.使用概率程序模拟人类概念学习:Brenden Lake、Ruslan Salakhutdinov 和 Joshua Tenenbaum,“通过概率程序诱导进行人类层面的概念学习”, Science 350(2015):1332–38。

5. Using probabilistic programs to model human concept learning: Brenden Lake, Ruslan Salakhutdinov, and Joshua Tenenbaum, “Human-level concept learning through probabilistic program induction,” Science 350 (2015): 1332–38.

6.有关地震监测应用和相关概率模型的详细描述,请参阅 Nimar Arora、Stuart Russell 和 Erik Sudderth 的《NET-VISA:网络处理垂直集成地震分析》,《美国地震学会公报》 103(2013 年):709-29。

6. For a detailed description of the seismic monitoring application and associated probability model, see Nimar Arora, Stuart Russell, and Erik Sudderth, “NET-VISA: Network processing vertically integrated seismic analysis,” Bulletin of the Seismological Society of America 103 (2013): 709–29.

7.描述首批严重自动驾驶汽车事故之一的新闻文章:Ryan Randazzo,《谁是 Uber 自动驾驶汽车事故的责任人?坦佩警方报告的说法不一》,《共和报》(azcentral.com),2017 年 3 月 29 日。

7. News article describing one of the first serious self-driving car crashes: Ryan Randazzo, “Who was at fault in self-driving Uber crash? Accounts in Tempe police report disagree,” Republic (azcentral.com), March 29, 2017.

附录 D

APPENDIX D

1. 归纳学习的奠基性讨论:David Hume,《人类理解的哲学论文》(A. Millar,1748 年)。

1. The foundational discussion of inductive learning: David Hume, Philosophical Essays Concerning Human Understanding (A. Millar, 1748).

2. Leslie Valiant,《可学习理论》,《 ACM 通讯》第 27 卷(1984 年):1134–42 页。另请参阅 Vladimir Vapnik 的《统计学习理论》 Wiley,1998 年)。Valiant 的方法侧重于计算复杂性,Vapnik 的方法侧重于对各类假设的学习能力进行统计分析,但两者均具有一个共同的理论核心,即数据与预测准确性之间的联系。

2. Leslie Valiant, “A theory of the learnable,” Communications of the ACM 27 (1984): 1134–42. See also Vladimir Vapnik, Statistical Learning Theory (Wiley, 1998). Valiant’s approach concentrated on computational complexity, Vapnik’s on statistical analysis of the learning capacity of various classes of hypotheses, but both shared a common theoretical core connecting data and predictive accuracy.

3. 例如,要学习“情境超级劫”与“自然情境超级劫”规则之间的区别,学习算法必须尝试重现先前通过虚着(pass)而非落子形成的棋盘局面。在不同国家,结果会有所不同。

3. For example, to learn the difference between the “situational superko” and “natural situational superko” rules, the learning algorithm would have to try repeating a board position that it had created previously by a pass rather than by playing a stone. The results would be different in different countries.

4.有关 ImageNet 竞赛的描述,请参阅 Olga Russakovsky 等人的《ImageNet 大规模视觉识别挑战赛》,《国际计算机视觉杂志》 115(2015 年):211-52。

4. For a description of the ImageNet competition, see Olga Russakovsky et al., “ImageNet large scale visual recognition challenge,” International Journal of Computer Vision 115 (2015): 211–52.

5.深度网络在视觉领域的首次演示:Alex Krizhevsky、Ilya Sutskever 和 Geoffrey Hinton,《使用深度卷积神经网络进行ImageNet 分类》,载于《神经信息处理系统进展》25,Fernando Pereira 等人编辑(2012 年)。

5. The first demonstration of deep networks for vision: Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton, “ImageNet classification with deep convolutional neural networks,” in Advances in Neural Information Processing Systems 25, ed. Fernando Pereira et al. (2012).

6.区分一百多个品种的狗的难度:Andrej Karpathy,“我在 ImageNet 上与 ConvNet 竞争中学到了什么”, Andrej Karpathy 博客,2014 年 9 月 2 日。

6. The difficulty of distinguishing over one hundred breeds of dogs: Andrej Karpathy, “What I learned from competing against a ConvNet on ImageNet,” Andrej Karpathy Blog, September 2, 2014.

7. Google 关于 Inceptionism 研究的博客文章:Alexander Mordvintsev、Christopher Olah 和 Mike Tyka,《Inceptionism:深入研究神经网络》,Google AI 博客,2015 年 6 月 17 日。该想法似乎源自 J. P. Lewis,《通过改进创造:梯度下降学习网络的创造力范式》,载于《IEEE 国际神经网络会议论文集》(IEEE,1988 年)。

7. Blog post on inceptionism research at Google: Alexander Mordvintsev, Christopher Olah, and Mike Tyka, “Inceptionism: Going deeper into neural networks,” Google AI Blog, June 17, 2015. The idea seems to have originated with J. P. Lewis, “Creation by refinement: A creativity paradigm for gradient descent learning networks,” in Proceedings of the IEEE International Conference on Neural Networks (IEEE, 1988).

8.关于 Geoff Hinton 对深度网络产生疑虑的新闻文章:Steve LeVine,《人工智能先驱称我们需要重新开始》, Axios 2017 年 9 月 15 日。

8. News article on Geoff Hinton having second thoughts about deep networks: Steve LeVine, “Artificial intelligence pioneer says we need to start over,” Axios, September 15, 2017.

9.深度学习的缺点目录:Gary Marcus,“深度学习:批判性评价”,arXiv:1801.00631(2018)。

9. A catalog of shortcomings of deep learning: Gary Marcus, “Deep learning: A critical appraisal,” arXiv:1801.00631 (2018).

10.一本关于深度学习的热门教科书,坦率地评估了其弱点:François Chollet, 《使用 Python进行深度学习》(Manning Publications,2017 年)。

10. A popular textbook on deep learning, with a frank assessment of its weaknesses: François Chollet, Deep Learning with Python (Manning Publications, 2017).

11.基于解释的学习的解释:Thomas Dietterich,“知识层面的学习”,机器学习1(1986):287-315。

11. An explanation of explanation-based learning: Thomas Dietterich, “Learning at the knowledge level,” Machine Learning 1 (1986): 287–315.

12.对基于解释的学习的一种表面上完全不同的解释:John Laird、Paul Rosenbloom 和 Allen Newell,《Soar 中的分块:一般学习机制的剖析》,《机器学习》 1(1986 年):11-46。

12. A superficially quite different explanation of explanation-based learning: John Laird, Paul Rosenbloom, and Allen Newell, “Chunking in Soar: The anatomy of a general learning mechanism,” Machine Learning 1 (1986): 11–46.

图片来源

Image Credits

图 2:(b) © The Sun / News Licensing;(c) 史密森学会档案馆提供。

Figure 2: (b) © The Sun / News Licensing; (c) Courtesy of Smithsonian Institution Archives.

图 4:© SRI International。creativecommons.org/licenses/by/3.0/legalcode。

Figure 4: © SRI International. creativecommons.org/licenses/by/3.0/legalcode.

图 5:(左)© 伯克利人工智能研究实验室;(右)© 波士顿动力公司。

Figure 5: (left) © Berkeley AI Research Lab; (right) © Boston Dynamics.

图 6:© 索尔斯坦伯格基金会/艺术家权利协会 (ARS),纽约。

Figure 6: © The Saul Steinberg Foundation / Artists Rights Society (ARS), New York.

图 7:(左)© Noam Eshel,Defense Update;(右)© 未来生命研究所 / Stuart Russell。

Figure 7: (left) © Noam Eshel, Defense Update; (right) © Future of Life Institute / Stuart Russell.

图 10:(左)© AFP;(右)由 Henrik Sorensen 提供。

Figure 10: (left) © AFP; (right) Courtesy of Henrik Sorensen.

图 11:Elysium © 2013 MRC II Distribution Company L.P. 保留所有权利。由哥伦比亚电影公司提供。

Figure 11: Elysium © 2013 MRC II Distribution Company L.P. All Rights Reserved. Courtesy of Columbia Pictures.

图 14:© OpenStreetMap 贡献者。OpenStreetMap.org。creativecommons.org/licenses/by/2.0/legalcode。

Figure 14: © OpenStreetMap contributors. OpenStreetMap.org. creativecommons.org/licenses/by/2.0/legalcode.

图 19:地形照片:DigitalGlobe via Getty Images。

Figure 19: Terrain photo: DigitalGlobe via Getty Images.

图 20:(右)由坦佩警察局提供。

Figure 20: (right) Courtesy of the Tempe Police Department.

图 24:© Jessica Mullen / Deep Dreamscope。creativecommons.org/licenses/by/2.0/legalcode。

Figure 24: © Jessica Mullen / Deep Dreamscope. creativecommons.org/licenses/by/2.0/legalcode.

索引

Index

本索引中的页码指的是本书的印刷版。提供的链接将带您到该印刷页的开头。您可能需要从该位置向前滚动才能在电子阅读器上找到相应的参考资料。

The page numbers in this index refer to the printed version of this book. The link provided will take you to the beginning of that print page. You may need to scroll forward from that location to find the corresponding reference on your e-reader.

AAAI(人工智能促进协会),250

AAAI (Association for the Advancement of Artificial Intelligence), 250

阿比尔,彼得73,192

Abbeel, Pieter, 73, 192

抽象动作,层次结构,87 – 90

abstract actions, hierarchy of, 87–90

抽象规划,264–66

abstract planning, 264–66

智能个人助理的访问缺陷,67-68

access shortcomings, of intelligent personal assistants, 67–68

动作电位,15

action potentials, 15

行动,发现,87 –90

actions, discovering, 87–90

执行器,72

actuators, 72

艾达,洛夫莱斯伯爵夫人。参见 洛夫莱斯,艾达

Ada, Countess of Lovelace. See Lovelace, Ada

适应性生物,18-19

adaptive organisms, 18–19

代理。请参阅 智能代理

agent. See intelligent agent

代理程序, 48

agent program, 48

“人工智能研究人员论人工智能风险” (亚历山大),153

“AI Researchers on AI Risk” (Alexander), 153

阿尔西内,杰基(Jacky Alciné),60

Alciné, Jacky, 60

亚历山大·斯科特,146、153、169-70

Alexander, Scott, 146, 153, 169–70

算法,33 –34

algorithms, 33–34

贝叶斯网络和275 –77

Bayesian networks and, 275–77

贝叶斯更新283,284

Bayesian updating, 283, 284

偏见和,128 –30

bias and, 128–30

下棋,62 –63

chess-playing, 62–63

编码, 34

coding of, 34

完备性定理和51 –52

completeness theorem and, 51–52

计算机硬件和34 – 35

computer hardware and, 34–35

内容选择8 –9,105

content selection, 8–9, 105

深度学习,58-59,288-93

deep learning, 58–59, 288–93

动态规划,54-55

dynamic programming, 54–55

常见例子,33 –34

examples of common, 33–34

问题的指数复杂性,38-39

exponential complexity of problems and, 38–39

停机问题,37 –38

halting problem and, 37–38

前瞻搜索, 47 , 49 –50, 260 –61

lookahead search, 47, 49–50, 260–61

命题逻辑和268 –70

propositional logic and, 268–70

强化学习,55-57,105

reinforcement learning, 55–57, 105

子程序内,34

subroutines within, 34

监督学习,58-59,285-93

supervised learning, 58–59, 285–93

阿里巴巴,250

Alibaba, 250

AlphaGo ,6、46-48、49-50、55、91、92、206-7、209-10、261、265、285

AlphaGo, 6, 46–48, 49–50, 55, 91, 92, 206–7, 209–10, 261, 265, 285

AlphaZero 47,48

AlphaZero, 47, 48

利他主义24,227-29

altruism, 24, 227–29

利他主义人工智能,173–75

altruistic AI, 173–75

亚马逊106,119,250

Amazon, 106, 119, 250

回声,64 – 65

Echo, 64–65

“拾取挑战”加速机器人发展,73-74

“Picking Challenge” to accelerate robot development, 73–74

分析引擎,40

Analytical Engine, 40

蚂蚁,25

ants, 25

奥恩·约瑟夫,123

Aoun, Joseph, 123

苹果 HomePod ,64-65

Apple HomePod, 64–65

“复杂性建筑” (Simon),265

“Architecture of Complexity, The” (Simon), 265

亚里士多德,20-21、39-40、50、52、53、114、245

Aristotle, 20–21, 39–40, 50, 52, 53, 114, 245

阿姆斯特朗,斯图尔特,221

Armstrong, Stuart, 221

阿诺德,安托万,21-22

Arnauld, Antoine, 21–22

Arrow,肯尼斯,223

Arrow, Kenneth, 223

人工智能(AI),1-12

artificial intelligence (AI), 1–12

代理(参见 智能代理

agent (See intelligent agent)

代理程序,48 –59

agent programs, 48–59

有益的,原则(参见 有益的 AI

beneficial, principles for (See beneficial AI)

对人类的益处,98 –102

benefits to humans of, 98–102

作为人类历史上最大的事件,1-4

as biggest event in human history, 1–4

超级智能 AI所需的概念突破(参见 超级智能 AI 所需的概念突破

conceptual breakthroughs required for (See conceptual breakthroughs required for superintelligent AI)

全球范围内的决策能力,75-76

decision making on global scale, capability for, 75–76

深度学习,6

deep learning and, 6

家用机器人,73-74

domestic robots and, 73–74

通用, 46 –48, 100 , 136

general-purpose, 46–48, 100, 136

全球规模,感知和决策能力,74-76

global scale, capability to sense and make decisions on, 74–76

目标(goals),41–42,48–53,136–42,165–69

goals and, 41–42, 48–53, 136–42, 165–69

治理,249-53

governance of, 249–53

健康进步和101

health advances and, 101

历史, 4 –6, 40 –42

history of, 4–6, 40–42

人类偏好和(参见 人类偏好

human preferences and (See human preferences)

想象超级智能机器能做什么,93-96

imagining what superintelligent machines could do, 93–96

智力,定义,39-61

intelligence, defining, 39–61

智能个人助理,67-71

intelligent personal assistants and, 67–71

超级智能的局限性,96-98

limits of superintelligence, 96–98

生活水平提高,98 –100

living standard increases and, 98–100

逻辑和,39 –40

logic and, 39–40

媒体和公众对进步的看法,62-64

media and public perception of advances in, 62–64

滥用(参见 人工智能的滥用

misuses of (See misuses of AI)

移动电话和64 – 65

mobile phones and, 64–65

乘数效应,99

multiplier effect of, 99

目标和,11 –12,43,48 –61,136 –42,165 –69​

objectives and, 11–12, 43, 48–61, 136–42, 165–69

过度智能的人工智能,132-44

overly intelligent AI, 132–44

创造科学进步的速度,6-9

pace of scientific progress in creating, 6–9

预测超级人工智能的到来,76-78

predicting arrival of superintelligent AI, 76–78

阅读能力,74-75

reading capabilities and, 74–75

人工智能带来的风险( 人工智能带来的风险

risk posed by (See risk posed by AI)

规模和,94 –96

scale and, 94–96

扩大感官输入和行动能力,94-95

scaling up sensory inputs and capacity for action, 94–95

自动驾驶汽车,65 –67, 181 –82, 247

self-driving cars and, 65–67, 181–82, 247

全球规模感知能力,75

sensing on global scale, capability to, 75

智能家居和71 – 72

smart homes and, 71–72

软件机器人和,64

softbots and, 64

语音识别功能,以及74 –75

speech recognition capabilities and, 74–75

标准模型,9 – 11, 13 , 48 – 61, 247

standard model of, 9–11, 13, 48–61, 247

图灵测试和40 –41

Turing test and, 40–41

辅导,100 –101

tutoring by, 100–101

虚拟现实创作者,101

virtual reality authoring by, 101

万维网和64

World Wide Web and, 64

“人工智能与2030年的生活” (人工智能百年研究)149,150

“Artificial Intelligence and Life in 2030” (One Hundred Year Study on Artificial Intelligence), 149, 150

阿西莫夫,艾萨克,141

Asimov, Isaac, 141

辅助游戏,192 –203

assistance games, 192–203

从长远来看,学习偏好是200 –202

learning preferences exactly in long run, 200–202

关闭游戏,196 –200

off-switch game, 196–200

回形针游戏,194-96

paperclip game, 194–96

禁令,以及202 –3

prohibitions and, 202–3

人类目标的不确定性,200-202

uncertainty about human objectives, 200–202

人工智能促进会(AAAI),250

Association for the Advancement of Artificial Intelligence (AAAI), 250

假设失败,186 –87

assumption failure, 186–87

阿特金森,罗伯特,158

Atkinson, Robert, 158

Atlas人形机器人,73

Atlas humanoid robot, 73

自主武器系统(LAWS),110-13

autonomous weapons systems (LAWS), 110–13

自主性丧失问题,255 –56

autonomy loss problem, 255–56

奥托,大卫,116

Autor, David, 116

复仇者联盟:无限战争(电影),224

Avengers: Infinity War (film), 224

“避免纳入人类目标”论点,165-69

“avoid putting in human goals” argument, 165–69

效用理论的公理基础,23-24

axiomatic basis for utility theory, 23–24

公理,185

axioms, 185

查尔斯·巴贝奇,40132-33

Babbage, Charles, 40, 132–33

双陆棋,55

backgammon, 55

百度,250

Baidu, 250

鲍德温,詹姆斯,18

Baldwin, James, 18

鲍德温效应,18-20

Baldwin effect, 18–20

伊恩·班克斯164

Banks, Iain, 164

银行出纳员,117-18

bank tellers, 117–18

贝叶斯,托马斯,54

Bayes, Thomas, 54

贝叶斯逻辑,54

Bayesian logic, 54

贝叶斯网络54,275-77

Bayesian networks, 54, 275–77

贝叶斯理性,54

Bayesian rationality, 54

贝叶斯更新283,284

Bayesian updating, 283, 284

贝叶斯定理,54

Bayes theorem, 54

行为、学习偏好,190 –92

behavior, learning preferences from, 190–92

行为矫正,104-7

behavior modification, 104–7

信仰状态,282-83

belief state, 282–83

有益的人工智能,171 –210, 247 –49

beneficial AI, 171–210, 247–49

对发展的谨慎,原因,179

caution regarding development of, reasons for, 179

可用于了解人类偏好的数据,180 –81

data available for learning about human preferences, 180–81

经济激励,179 –80

economic incentives for, 179–80

邪恶行为,179

evil behavior and, 179

学习预测人类的偏好,176-77

learning to predict human preferences, 176–77

道德困境,178

moral dilemmas and, 178

人工智能的目标是最大限度地实现人类的偏好,173–75

objective of AI is to maximize realization of human preferences, 173–75

原则,172 –79

principles for, 172–79

证明(参见 有益 AI 的证明

proofs for (See proofs for beneficial AI)

人类偏好的不确定性,175-76

uncertainty as to what human preferences are, 175–76

价值观,定义,177 –78

values, defining, 177–78

Bentham,Jeremy24,219

Bentham, Jeremy, 24, 219

保罗·伯格,182

Berg, Paul, 182

伯克利机器人消除繁琐任务(BRETT),73

Berkeley Robot for the Elimination of Tedious Tasks (BRETT), 73

丹尼尔·伯努利,22-23

Bernoulli, Daniel, 22–23

“比尔·盖茨害怕人工智能,但人工智能研究人员更了解人工智能” (《大众科学》), 152

“Bill Gates Fears AI, but AI Researchers Know Better” (Popular Science), 152

敲诈勒索,104-5

blackmail, 104–5

眨眼反射,57

blinking reflex, 57

区块链,161

blockchain, 161

棋盘游戏,45

board games, 45

乔治·布尔,268

Boole, George, 268

布尔(命题)逻辑,51,268-70

Boolean (propositional) logic, 51, 268–70

引导过程,81 –82

bootstrapping process, 81–82

波士顿动力公司,73

Boston Dynamics, 73

博斯特罗姆,尼克,102 , 144 , 145 , 150 , 166 , 167 , 183 , 253

Bostrom, Nick, 102, 144, 145, 150, 166, 167, 183, 253

大脑,16、17-18

brains, 16, 17–18

奖励制度,17-18

reward system and, 17–18

Summit 机器,相比之下,34

Summit machine, compared, 34

BRETT(伯克利机器人消除繁琐任务),73

BRETT (Berkeley Robot for the Elimination of Tedious Tasks), 73

布林,谢尔盖,81

Brin, Sergey, 81

布鲁克斯,罗德尼,168

Brooks, Rodney, 168

埃里克·布林约尔松(Erik Brynjolfsson)117

Brynjolfsson, Erik, 117

《布达佩斯网络犯罪公约》,253-54

Budapest Convention on Cybercrime, 253–54

巴特勒,塞缪尔133 –34,159

Butler, Samuel, 133–34, 159

“难道我们不能……”回应人工智能带来的风险,160-69

“can’t we just . . .” responses to risks posed by AI, 160–69

“……避免把人类目标纳入考量” ,165-69

“. . . avoid putting in human goals,” 165–69

“...与机器融合” ,163-65

“. . . merge with machines,” 163–65

“……把它放进盒子里” ,161-63

“. . . put it in a box,” 161–63

“...关掉它” ,160-61

“. . . switch it off,” 160–61

“...在人机团队中工作”,163

“. . . work in human-machine teams,” 163

卡尔达诺,杰罗拉莫,21

Cardano, Gerolamo, 21

护理专业,122

caring professions, 122

Chace,Calum,113

Chace, Calum, 113

人类偏好随时间的变化,240-45

changes in human preferences over time, 240–45

《换位》(洛奇),121

Changing Places (Lodge), 121

跳棋程序,55,261

checkers program, 55, 261

国际象棋程序,62-63

chess programs, 62–63

Chollet,François,293

Chollet, François, 293

分块,295

chunking, 295

电路,291 – 92

circuits, 291–92

美国有线电视新闻网,108

CNN, 108

CODE(拒绝环境下的协同操作),112

CODE (Collaborative Operations in Denied Environments), 112

组合复杂性,258

combinatorial complexity, 258

通用作战图,69

common operational picture, 69

补偿效应,114-17

compensation effects, 114–17

完备性定理(哥德尔定理),51 –52

completeness theorem (Gödel’s), 51–52

问题的复杂性,38-39

complexity of problems, 38–39

全面禁止核试验条约(CTBT)地震监测,279-80

Comprehensive Nuclear-Test-Ban Treaty (CTBT) seismic monitoring, 279–80

计算机编程,119

computer programming, 119

电脑,32 – 61

computers, 32–61

算法和(参见 算法

algorithms and (See algorithms)

问题的复杂性,38-39

complexity of problems and, 38–39

停机问题,37 –38

halting problem and, 37–38

硬件,34 –35

hardware, 34–35

智能(参见 人工智能)

intelligent (See artificial intelligence)

计算极限,36-39

limits of computation, 36–39

软件限制,37

software limitations, 37

专用设备,建筑,35 –36

special-purpose devices, building, 35–36

普遍性,32

universality and, 32

计算机科学,33

computer science, 33

“计算机器与智能”(图灵40-41,149

“Computing Machinery and Intelligence” (Turing), 40–41, 149

超级智能 AI 所需的概念突破,78-93

conceptual breakthroughs required for superintelligent AI, 78–93

行动,发现,87 –90

actions, discovering, 87–90

概念和理论的累积学习,82-87

cumulative learning of concepts and theories, 82–87

语言/常识问题,79 –82

language/common sense problem, 79–82

心理活动,管理,90-92

mental activity, managing, 90–92

意识,16-17

consciousness, 16–17

结果主义,217-19

consequentialism, 217–19

内容选择算法,8-9,105

content selection algorithms, 8–9, 105

智能个人助理的内容缺陷,67-68

content shortcomings, of intelligent personal assistants, 67–68

控制理论10,44-45,54,176

control theory, 10, 44–45, 54, 176

卷积神经网络,47

convolutional neural networks, 47

评估解决方案和目标的成本函数,48

cost function to evaluate solutions, and goals, 48

信誉联盟,109

Credibility Coalition, 109

CRISPR-Cas9, 156

CRISPR-Cas9, 156

概念和理论的累积学习,82-87

cumulative learning of concepts and theories, 82–87

网络安全,186-87

cybersecurity, 186–87

《每日电讯报》 77

Daily Telegraph, 77

全球范围内的决策,75-76

decision making on global scale, 75–76

退相干,36

decoherence, 36

深蓝62,261

Deep Blue, 62, 261

深度卷积网络,288-90

deep convolutional network, 288–90

深梦图像,291

deep dreaming images, 291

深度伪造,105-6

deepfakes, 105–6

深度学习,6,58-59,86-87,288-93

deep learning, 6, 58–59, 86–87, 288–93

DeepMind,90

DeepMind, 90

AlphaGo ,6、46-48、49-50、55、91、92、206-7、209-10、261、265、285

AlphaGo, 6, 46–48, 49–50, 55, 91, 92, 206–7, 209–10, 261, 265, 285

AlphaZero 47,48

AlphaZero, 47, 48

DQN 系统,55 –56

DQN system, 55–56

偏转论证,154-59

deflection arguments, 154–59

“研究无法控制”论点,154-56

“research can’t be controlled” arguments, 154–56

对人工智能风险的沉默,158-59

silence regarding risks of AI, 158–59

部落主义,150,159-60

tribalism, 150, 159–60

“那又怎么说”论调(whataboutery),156–57

whataboutery, 156–57

Delilah(勒索机器人),105

Delilah (blackmail bot), 105

否认人工智能带来的风险,146-54

denial of risk posed by AI, 146–54

“这很复杂”论点,147 –48

“it’s complicated” argument, 147–48

“这是不可能的”论点,149 –50

“it’s impossible” argument, 149–50

“现在担心这个还太早”论点,150 –52

“it’s too soon to worry about it” argument, 150–52

卢德主义指控,153 –54

Luddism accusation and, 153–54

“我们是专家”的论调,152 –54

“we’re the experts” argument, 152–54

义务论伦理学,217

deontological ethics, 217

灵巧性问题,机器人,73 –74

dexterity problem, robots, 73–74

迪金森,迈克尔,190

Dickinson, Michael, 190

迪克曼斯,恩斯特,65

Dickmanns, Ernst, 65

DigitalGlobe,75

DigitalGlobe, 75

家用机器人,73-74

domestic robots, 73–74

多巴胺,17,205-6

dopamine, 17, 205–6

Dota 2,56

Dota 2, 56

DQN 系统,55 –56

DQN system, 55–56

沙丘(赫伯特),135

Dune (Herbert), 135

动态规划算法,54-55

dynamic programming algorithms, 54–55

大肠杆菌14-15

E. coli, 14–15

eBay,106

eBay, 106

ECHO(首款智能家居),71

ECHO (first smart home), 71

“我们孙辈的经济前景”(凯恩斯),113–14,120–21

“Economic Possibilities for Our Grandchildren” (Keynes), 113–14, 120–21

经济奇点:人工智能与资本主义的消亡(Chace),113

The Economic Singularity: Artificial Intelligence and the Death of Capitalism (Chace), 113

《经济学家》,145

Economist, The, 145

埃奇沃思,弗朗西斯,238

Edgeworth, Francis, 238

艾森豪威尔,德怀特,249

Eisenhower, Dwight, 249

电动作电位,15

electrical action potentials, 15

Eliza(第一个聊天机器人),67

Eliza (first chatbot), 67

Elmo(将棋程序),47

Elmo (shogi program), 47

乔恩·埃尔斯特242

Elster, Jon, 242

极乐世界(电影),127

Elysium (film), 127

紧急制动,57

emergency braking, 57

人类衰弱问题,254-55

enfeeblement of humans problem, 254–55

嫉妒,229-31

envy, 229–31

伊壁鸠鲁,219

Epicurus, 219

平衡解,3031,195 – 96

equilibrium solutions, 30–31, 195–96

埃瑞璜(巴特勒)133 –34,159

Erewhon (Butler), 133–34, 159

Etzioni,Oren152,157

Etzioni, Oren, 152, 157

优生运动,155-56

eugenics movement, 155–56

预期价值规则,22 –23

expected value rule, 22–23

经验、学习,285 –95

experience, learning from, 285–95

体验自我和偏好,238-40

experiencing self, and preferences, 238–40

基于解释的学习,294-95

explanation-based learning, 294–95

Facebook108,250

Facebook, 108, 250

事实、虚构和预测(古德曼),85

Fact, Fiction and Forecast (Goodman), 85

事实核查108-9,110

fact-checking, 108–9, 110

factcheck.org,108

factcheck.org, 108

对死亡的恐惧(作为工具性目标),140-42

fear of death (as an instrumental goal), 140–42

特征工程,84 –85

feature engineering, 84–85

费马,皮埃尔·德,185

Fermat, Pierre de, 185

费马最后定理,185

Fermat’s Last Theorem, 185

费兰蒂 Mark I,34

Ferranti Mark I, 34

第五代项目,271

Fifth Generation project, 271

防火墙人工智能系统,161-63

firewalling AI systems, 161–63

一阶逻辑51,270-72

first-order logic, 51, 270–72

概率语言和277 –80

probabilistic languages and, 277–80

命题逻辑区分,270

propositional logic distinguished, 270

福特,马丁,113

Ford, Martin, 113

福斯特,E. M.,254–55

Forster, E. M., 254–55

福克斯新闻,108

Fox News, 108

弗雷格,戈特洛布,270

Frege, Gottlob, 270

富尔,鲍勃,190

Full, Bob, 190

G7, 250 –51

G7, 250–51

伽利略·伽利莱,85-86

Galileo Galilei, 85–86

赌博,21-23

gambling, 21–23

博弈论,28-32另见 协助游戏

game theory, 28–32. See also assistance games

盖茨,比尔56,153

Gates, Bill, 56, 153

GDPR(通用数据保护条例),127-29

GDPR (General Data Protection Regulation), 127–29

Geminoid DK(机器人),125

Geminoid DK (robot), 125

通用数据保护条例(GDPR),127-29

General Data Protection Regulation (GDPR), 127–29

通用人工智能46-48,100,136

general-purpose artificial intelligence, 46–48, 100, 136

几何对象, 33

geometric objects, 33

魅力, 129

Glamour, 129

全球学习XPRIZE竞赛,70

Global Learning XPRIZE competition, 70

围棋646 –47,49 –50,5155 56​

Go, 6, 46–47, 49–50, 51, 55, 56

组合复杂性,259 –61

combinatorial complexity and, 259–61

命题逻辑和269

propositional logic and, 269

监督学习算法,286-87

supervised learning algorithm and, 286–87

思考、学习,293 –95

thinking, learning from, 293–95

目标(goals),41–42,48–53,136–42,165–69

goals, 41–42, 48–53, 136–42, 165–69

上帝与魔像(维纳),137 – 38

God and Golem (Wiener), 137–38

哥德尔,库尔特,51 , 52

Gödel, Kurt, 51, 52

约翰·沃尔夫冈·冯·歌德,137

Goethe, Johann Wolfgang von, 137

IJ142 –43,153,208 –9

Good, I. J., 142–43, 153, 208–9

古德哈特定律,77

Goodhart’s law, 77

古德曼,尼尔森,85

Goodman, Nelson, 85

老式人工智能(GOFAI),271

Good Old-Fashioned AI (GOFAI), 271

108,112-13

Google, 108, 112–13

DeepMind(参见 DeepMind

DeepMind (See DeepMind)

主场,64 – 65

Home, 64–65

在 Google Photo 中将人误认为大猩猩,60

misclassifying people as gorillas in Google Photo, 60

张量处理单元 (TPU),35

tensor processing units (TPUs), 35

大猩猩问题,132-36

gorilla problem, 132–36

人工智能治理,249-53

governance of AI, 249–53

政府奖惩制度,106-7

governmental reward and punishment systems, 106–7

大脱钩,117

Great Decoupling, 117

贪婪(作为工具性目标),140-42

greed (as an instrumental goal), 140–42

Grice,H.Paul,205

Grice, H. Paul, 205

格赖斯分析,205

Gricean analysis, 205

停机问题,37 –38

halting problem, 37–38

手工建造问题,机器人,73

hand construction problem, robots, 73

哈丁,加勒特,31

Hardin, Garrett, 31

硬起飞场景,144

hard takeoff scenario, 144

哈罗普(导弹),111

Harop (missile), 111

哈萨尼,约翰220,229

Harsanyi, John, 220, 229

哈萨比斯,德米斯,271 –72, 293

Hassabis, Demis, 271–72, 293

霍金,斯蒂芬4,153

Hawking, Stephen, 4, 153

健康进步,101

health advances, 101

贺建奎,156

He Jiankui, 156

赫伯特·弗兰克,135

Herbert, Frank, 135

抽象行动的层次结构,87-90,265-66

hierarchy of abstract actions, 87–90, 265–66

欧盟人工智能高级专家组,251

High-Level Expert Group on Artificial Intelligence (EU), 251

Hillarp,Nils-Åke,17

Hillarp, Nils-Åke, 17

欣顿,杰夫,290

Hinton, Geoff, 290

赫希,弗雷德,230

Hirsch, Fred, 230

霍布斯,托马斯,246

Hobbes, Thomas, 246

霍华德庄园(福斯特),254

Howard’s End (Forster), 254

《赫芬顿邮报》,4

Huffington Post, 4

人类生殖细胞改变,禁止,155-56

human germline alteration, ban on, 155–56

人机协作,163 –65

human–machine teaming, 163–65

人类偏好,211-45

human preferences, 211–45

行为、学习偏好,190 –92

behavior, learning preferences from, 190–92

有益的人工智能,172-77

beneficial AI and, 172–77

随着时间的推移,240 –45

changes in, over time, 240–45

不同的人,学会在偏好之间做出权衡,213 –27

different people, learning to make trade-offs between preferences of, 213–27

情绪和,232 –34

emotions and, 232–34

错误,236 –37

errors as to, 236–37

体验自我,238-40

of experiencing self, 238–40

异质性,212 –13

heterogeneity of, 212–13

忠诚的人工智能,215 –17

loyal AI, 215–17

修改,243 –45

modification of, 243–45

善良、邪恶和嫉妒的人,227-31

of nice, nasty and envious humans, 227–31

记住自我,238-40

of remembering self, 238–40

愚蠢,232-34

stupidity and, 232–34

传递性,23 –24

transitivity of, 23–24

不确定性,235 –37

uncertainty and, 235–37

更新,241 –42

updates in, 241–42

功利主义人工智能(参见 功利主义/功利主义人工智能

utilitarian AI (See utilitarianism/utilitarian AI)

效用理论,23-27

utility theory and, 23–27

人类角色,接管,124-31

human roles, takeover of, 124–31

人类对人类的利用(维纳),137

Human Use of Human Beings (Wiener), 137

谦逊的人工智能,175-76

humble AI, 175–76

谟,大卫,167,287-88

Hume, David, 167, 287–88

IBM62,80,250

IBM, 62, 80, 250

理想功利主义,219

ideal utilitarianism, 219

IEEE(电气电子工程师协会),250

IEEE (Institute of Electrical and Electronics Engineers), 250

无知,52-53

ignorance, 52–53

模仿游戏,40-41

imitation game, 40–41

初始主义图像,291

inceptionism images, 291

归纳逻辑编程,86

inductive logic programming, 86

归纳推理,287-88

inductive reasoning, 287–88

输入,到智能代理,42 –43

inputs, to intelligent agents, 42–43

本能生物,18-19

instinctive organisms, 18–19

电气电子工程师协会 (IEEE),250

Institute of Electrical and Electronics Engineers (IEEE), 250

工具性目标,141-42,196

instrumental goal, 141–42, 196

保险承保人,119

insurance underwriters, 119

智力,13-61

intelligence, 13–61

动作电位和15

action potentials and, 15

大脑和,16,17-18

brains and, 16, 17–18

电脑和39 – 61

computers and, 39–61

意识,16-17

consciousness and, 16–17

大肠杆菌和,14-15

E. coli and, 14–15

进化起源,14 –18

evolutionary origins of, 14–18

学习,15、18-20

learning and, 15, 18–20

神经网络和16

nerve nets and, 16

实践推理,20

practical reasoning and, 20

理性,20 – 32

rationality and, 20–32

标准模型,9 – 11, 13 , 48 – 61, 247

standard model of, 9–11, 13, 48–61, 247

成功推理,20

successful reasoning and, 20

情报机构,104

intelligence agencies, 104

智能爆炸,142–44,208–9

intelligence explosions, 142–44, 208–9

智能代理,42 –48

intelligent agent, 42–48

产生的行动,48

actions generated by, 48

代理程序和48 –59

agent programs and, 48–59

定义,42

defined, 42

设计和问题类型,43-45

design of, and problem types, 43–45

环境和43 , 44 , 45 –46

environment and, 43, 44, 45–46

输入至,42 –43

inputs to, 42–43

多智能体合作设计,94

multi-agent cooperation design, 94

目标和,43、48-61

objectives and, 43, 48–61

反射,57 –59

reflex, 57–59

智能计算机。参见 人工智能 (AI)

intelligent computers. See artificial intelligence (AI)

智能个人助理,67-71,101

intelligent personal assistants, 67–71, 101

常识建模和68 –69

commonsense modeling and, 68–69

设计模板,69 –70

design template for, 69–70

教育系统,70

education systems, 70

卫生系统,69-70

health systems, 69–70

个人财务系统,70

personal finance systems, 70

隐私考虑,70-71

privacy considerations, 70–71

早期系统的缺点,67-68

shortcomings of early systems, 67–68

刺激—反应模板,以及67

stimulus–response templates and, 67

理解内容,改进,68

understanding content, improvements in, 68

国际原子能机构,249

International Atomic Energy Agency, 249

物联网 (IoT),65

Internet of Things (IoT), 65

人际服务是就业的未来,122-24

interpersonal services as the future of employment, 122–24

算法偏见,128-30

algorithmic bias and, 128–30

影响人们的决定,使用机器,126-28

decisions affecting people, use of machines in, 126–28

仿人形机器人,124-26

robots built in humanoid form and, 124–26

棘手问题,38 –39

intractable problems, 38–39

逆向强化学习,191-93

inverse reinforcement learning, 191–93

智商,48

IQ, 48

石黑浩,125

Ishiguro, Hiroshi, 125

是-应问题,167

is-ought problem, 167

“这很复杂”论点,147 –48

“it’s complicated” argument, 147–48

“这是不可能的”论点,149 –50

“it’s impossible” argument, 149–50

“现在担心这个还太早”论点,150 –52

“it’s too soon to worry about it” argument, 150–52

水母,16

jellyfish, 16

危险边缘!(电视节目),80

Jeopardy! (tv show), 80

杰文斯,威廉·斯坦利,222

Jevons, William Stanley, 222

佳佳(机器人),125

JiaJia (robot), 125

兼爱(jian ai),219

jian ai, 219

卡尼曼,丹尼尔,238-40

Kahneman, Daniel, 238–40

加里·卡斯帕罗夫, 62 , 90 , 261

Kasparov, Garry, 62, 90, 261

柯洁,6

Ke Jie, 6

凯利,凯文97,148

Kelly, Kevin, 97, 148

肯尼·戴维, 153 , 163

Kenny, David, 153, 163

凯恩斯,约翰·梅纳德,11314,12021,122

Keynes, John Maynard, 113–14, 120–21, 122

迈达斯国王问题,136 –40

King Midas problem, 136–40

Kitkit School(软件系统),70

Kitkit School (software system), 70

知识,79 –82,267 –72

knowledge, 79–82, 267–72

知识系统,50-51

knowledge-based systems, 50–51

克鲁格曼,保罗,117

Krugman, Paul, 117

库兹韦尔,雷,163-64

Kurzweil, Ray, 163–64

语言/常识问题,79 –82

language/common sense problem, 79–82

拉普拉斯,皮埃尔-西蒙,54

Laplace, Pierre-Simon, 54

激光干涉引力波天文台(LIGO),82-84

Laser-Interferometer Gravitational-Wave Observatory (LIGO), 82–84

学习,15

learning, 15

行为、学习偏好,190 –92

behavior, learning preferences from, 190–92

引导过程,81 –82

bootstrapping process, 81–82

文化和19

culture and, 19

概念和理论的累积学习,82-87

cumulative learning of concepts and theories, 82–87

数据驱动的观点,82 –83

data-driven view of, 82–83

深度学习6,58-59,84,86-87,288-93

deep learning, 6, 58–59, 84, 86–87, 288–93

作为进化加速器,18-20

as evolutionary accelerator, 18–20

根据经验,285 – 93

from experience, 285–93

基于解释的学习,294-95

explanation-based learning, 294–95

特征工程和84 –85

feature engineering and, 84–85

逆向强化学习,191-93

inverse reinforcement learning, 191–93

强化学习17,47,55-57,105,190-91

reinforcement learning, 17, 47, 55–57, 105, 190–91

监督学习,58-59,285-93

supervised learning, 58–59, 285–93

思考,293 –95

from thinking, 293–95

LeCun, Yann, 47 , 165

LeCun, Yann, 47, 165

法律专业,119

legal profession, 119

致命自主武器系统(LAWS),110-13

lethal autonomous weapons systems (LAWS), 110–13

生命3.0 (泰格马克114,138

Life 3.0 (Tegmark), 114, 138

LIGO(激光干涉引力波天文台),82-84

LIGO (Laser-Interferometer Gravitational-Wave Observatory), 82–84

生活水平提高,AI ,98-100

living standard increases, and AI, 98–100

劳埃德,塞斯,37

Lloyd, Seth, 37

劳埃德,威廉,31

Lloyd, William, 31

卢尔,拉蒙,40

Llull, Ramon, 40

洛奇,大卫,1

Lodge, David, 1

逻辑, 39 –40, 50 –51, 267 –72

logic, 39–40, 50–51, 267–72

贝叶斯,54

Bayesian, 54

定义,267

defined, 267

一阶,51 –52,270 –72

first-order, 51–52, 270–72

正式语言要求,267

formal language requirement, 267

无知,52-53

ignorance and, 52–53

编程,开发,271

programming, development of, 271

命题(布尔51,268-70

propositional (Boolean), 51, 268–70

前瞻搜索, 47 , 49 –50, 260 –61

lookahead search, 47, 49–50, 260–61

漏洞原则,202-3,216

loophole principle, 202–3, 216

洛夫莱斯,艾达,40,132–33

Lovelace, Ada, 40, 132–33

忠诚的人工智能,215 –17

loyal AI, 215–17

卢德主义指控,153 –54

Luddism accusation, 153–54

机器,33

machines, 33

“机器停止” (福斯特), 254 –55

“Machine Stops, The” (Forster), 254–55

机器翻译,6

machine translation, 6

迈克菲,安德鲁,117

McAfee, Andrew, 117

麦卡锡,约翰4-5,50,51,52,53,65,77

McCarthy, John, 4–5, 50, 51, 52, 53, 65, 77

恶意,228-29

malice, 228–29

恶意软件, 253

malware, 253

地图导航,257 –58

map navigation, 257–58

有益人工智能的数学证明,185-90

mathematical proofs for beneficial AI, 185–90

数学,33

mathematics, 33

矩阵,33

matrices, 33

黑客帝国电影),222,235

Matrix, The (film), 222, 235

MavHome 项目,71

MavHome project, 71

机械计算器,40

mechanical calculator, 40

心理安全,107-10

mental security, 107–10

“与机器合并”论点,163-65

“merge with machines” argument, 163–65

元推理,262

metareasoning, 262

《伦理学方法》(西奇威克),224–25

Methods of Ethics, The (Sidgwick), 224–25

微软,250

Microsoft, 250

TrueSkill 系统,279

TrueSkill system, 279

约翰·斯图亚特·穆勒,217-218,219

Mill, John Stuart, 217–18, 219

明斯基,马文4 –5,76 ,153

Minsky, Marvin, 4–5, 76, 153

人工智能的滥用,103 –31,253 –54

misuses of AI, 103–31, 253–54

行为矫正,104-7

behavior modification, 104–7

敲诈勒索,104-5

blackmail, 104–5

深度伪造,105-6

deepfakes, 105–6

政府奖惩制度,106-7

governmental reward and punishment systems, 106–7

情报机构和104

intelligence agencies and, 104

人际服务,接管,124 –31

interpersonal services, takeover of, 124–31

致命自主武器系统(LAWS),110-13

lethal autonomous weapons systems (LAWS), 110–13

心理安全,107-10

mental security and, 107–10

工作,消除,113 –24

work, elimination of, 113–24

移动电话,64 – 65

mobile phones, 64–65

单调性,24

monotonicity and, 24

摩尔,G. E.,219,221,222

Moore, G. E., 219, 221, 222

摩尔定律,34-35

Moore’s law, 34–35

莫拉维克,汉斯,144

Moravec, Hans, 144

摩根,康威·劳埃德,18

Morgan, Conway Lloyd, 18

摩根斯坦,奥斯卡,23

Morgenstern, Oskar, 23

墨子(Mozi),219

Mozi (Mozi), 219

多智能体合作设计,94

multi-agent cooperation design, 94

马斯克,埃隆,153 , 164

Musk, Elon, 153, 164

“超人人工智能的神话” (Kelly),148

“Myth of Superhuman AI, The” (Kelly), 148

狭义工具)人工智能,46、47、136

narrow (tool) artificial intelligence, 46, 47, 136

纳什,约翰30,195

Nash, John, 30, 195

纳什均衡,30 –31,195 –96

Nash equilibrium, 30–31, 195–96

美国国立卫生研究院(NIH),155

National Institutes of Health (NIH), 155

消极利他主义,229-30

negative altruism, 229–30

NELL(永无止境的语言学习)项目,81

NELL (Never-Ending Language Learning) project, 81

神经网(nerve nets),16

nerve nets, 16

NET-VISA,279 –80

NET-VISA, 279–80

网络执行法(德国)108,109

Network Enforcement Act (Germany), 108, 109

神经尘埃,164-65

neural dust, 164–65

Neuralink 公司,164

Neuralink Corporation, 164

神经织网,164

neural lace, 164

神经网络,288-89

neural networks, 288–89

神经元,15,16,19

neurons, 15, 16, 19

永无止境的语言学习(NELL)项目,81

Never-Ending Language Learning (NELL) project, 81

纽厄尔,艾伦,295

Newell, Allen, 295

艾萨克·牛顿,85-86

Newton, Isaac, 85–86

《纽约客》 88

New Yorker, The, 88

吴恩达(Ng, Andrew),151,152

Ng, Andrew, 151, 152

诺维格,彼得,2,62–63

Norvig, Peter, 2, 62–63

禁止自杀规则,287

no suicide rule, 287

诺齐克,罗伯特,223

Nozick, Robert, 223

核工业,157,249

nuclear industry, 157, 249

核物理学,7-8

nuclear physics, 7–8

《助推》(Thaler & Sunstein),244

Nudge (Thaler & Sunstein), 244

目标11 –12,43,48 –61,136 –42,165 –69 。另参阅目标

objectives, 11–12, 43, 48–61, 136–42, 165–69. See also goals

关闭游戏,196 –200

off-switch game, 196–200

onebillion(软件系统),70

onebillion (software system), 70

人工智能百年研究(AI100), 149 , 150

One Hundred Year Study on Artificial Intelligence (AI100), 149, 150

OpenAI,56

OpenAI, 56

运筹学10,54,176

operations research, 10, 54, 176

Oracle 人工智能系统,161-63

Oracle AI systems, 161–63

正交性论题,167-68

orthogonality thesis, 167–68

奥瓦迪亚,阿维夫,108

Ovadya, Aviv, 108

过度假设,85

overhypothesis, 85

过度智能的人工智能,132-44

overly intelligent AI, 132–44

恐惧和贪婪,140-42

fear and greed, 140–42

大猩猩问题,132-36

gorilla problem, 132–36

情报爆炸,142 –44,208 –9

intelligence explosions and, 142–44, 208–9

迈达斯国王问题,136 –40

King Midas problem, 136–40

回形针游戏,194-96

paperclip game, 194–96

帕菲特,德里克,225

Parfit, Derek, 225

人工智能伙伴关系180,250

Partnership on AI, 180, 250

帕斯卡·布莱斯21-22,40

Pascal, Blaise, 21–22, 40

《印度之行》(福斯特),254

Passage to India, A (Forster), 254

珀尔,朱迪亚(Judea Pearl),54,275

Pearl, Judea, 54, 275

Perdix(无人机),112

Perdix (drone), 112

平克,史蒂文,158 , 165 –66, 168

Pinker, Steven, 158, 165–66, 168

Planet(卫星公司),75

Planet (satellite corporation), 75

政治学(亚里士多德),114

Politics (Aristotle), 114

波普尔,卡尔,221-22

Popper, Karl, 221–22

《大众科学》,152

Popular Science, 152

地位商品,230-31

positional goods, 230–31

实践推理,20

practical reasoning, 20

语用学,204

pragmatics, 204

偏好自主原则,220,241

preference autonomy principle, 220, 241

偏好。参见 人类偏好

preferences. See human preferences

偏好功利主义,220

preference utilitarianism, 220

普莱斯,理查德,54

Price, Richard, 54

骄傲,230-31

pride, 230–31

《原始阐释者》,133

Primitive Expounder, 133

囚徒困境,30-31

prisoner’s dilemma, 30–31

隐私,70-71

privacy, 70–71

概率论,21-22,273-84

probability theory, 21–22, 273–84

贝叶斯网络和275 –77

Bayesian networks and, 275–77

一阶概率语言,277-80

first-order probabilistic languages, 277–80

独立和274

independence and, 274

跟踪不能直接观察到的现象,280-84

keeping track of not directly observable phenomena, 280–84

概率规划,54-55,84,279-80

probabilistic programming, 54–55, 84, 279–80

编程语言, 34

programming language, 34

程序,33

programs, 33

禁令,202-3

prohibitions, 202–3

阿里斯托项目,80

Project Aristo, 80

Prolog,271

Prolog, 271

有益人工智能的证明

proofs for beneficial AI

辅助游戏,184 – 210,192 203

assistance games, 184–210, 192–203

从行为中学习偏好,190-92

learning preferences from behavior, 190–92

数学保证,185-90

mathematical guarantees, 185–90

递归自我完善,208-10

recursive self-improvement and, 208–10

请求和指示,解释,203-5

requests and instructions, interpretation of, 203–5

电线问题,205 –8

wireheading problem and, 205–8

命题逻辑,51,268-70

propositional logic, 51, 268–70

普京,弗拉基米尔182,183

Putin, Vladimir, 182, 183

“放进盒子里”论点,161-63

“put it in a box” argument, 161–63

谜题,45

puzzles, 45

量子计算,35-36

quantum computation, 35–36

量子比特设备,35 –36

qubit devices, 35–36

随机策略,29

randomized strategy, 29

理性

rationality

亚里士多德的表述,20-21

Aristotle’s formulation of, 20–21

贝叶斯,54

Bayesian, 54

批评,24-26

critiques of, 24–26

预期价值规则,22 –23

expected value rule and, 22–23

赌博和,21-23

gambling and, 21–23

博弈论和28 – 32

game theory and, 28–32

人类偏好的不一致以及有益人工智能理论的发展,26-27

inconsistency in human preferences, and developing theory of beneficial AI, 26–27

逻辑和,39 –40

logic and, 39–40

单调性,24

monotonicity and, 24

纳什均衡,30-31

Nash equilibrium and, 30–31

偏好和23 –27

preferences and, 23–27

概率和,21 –22

probability and, 21–22

随机策略,29

randomized strategy and, 29

对于单个代理,20 – 27

for single agent, 20–27

及物性,23 –24

transitivity and, 23–24

两名特工,27 – 32

for two agents, 27–32

不确定性,21

uncertainty and, 21

效用理论,22 – 26

utility theory and, 22–26

理性元推理,262

rational metareasoning, 262

阅读能力,74 –75

reading capabilities, 74–75

现实世界的决策问题

real-world decision problem

复杂性,39

complexity and, 39

理由与人(帕菲特),225

Reasons and Persons (Parfit), 225

重组 DNA 咨询委员会,155

Recombinant DNA Advisory Committee, 155

重组DNA研究,155-56

recombinant DNA research, 155–56

递归自我完善,208-10

recursive self-improvement, 208–10

红线政策(redlining),128

redlining, 128

反射代理,57–59

reflex agents, 57–59

强化学习17,47,55-57,105,190-91

reinforcement learning, 17, 47, 55–57, 105, 190–91

记住自己和偏好,238-40

remembering self, and preferences, 238–40

令人厌恶的结论,225

Repugnant Conclusion, 225

声誉系统,108-9

reputation systems, 108–9

“研究无法控制”论点,154-56

“research can’t be controlled” arguments, 154–56

零售收银员,117-18

retail cashiers, 117–18

奖励函数,53-54,55

reward function, 53–54, 55

奖励制度,17

reward system, 17

机器人的崛起:科技与失业未来的威胁(福特),113

Rise of the Robots: Technology and the Threat of a Jobless Future (Ford), 113

人工智能带来的风险,145-70

risk posed by AI, 145–70

偏转论证,154-59

deflection arguments, 154–59

否认问题,146-54

denial of problem, 146–54

罗宾逊,艾伦,5

Robinson, Alan, 5

罗切斯特,纳撒尼尔,4-5

Rochester, Nathaniel, 4–5

卢瑟福,欧内斯特,7 , 77 , 85 – 86, 150

Rutherford, Ernest, 7, 77, 85–86, 150

萨克斯,杰弗里,230

Sachs, Jeffrey, 230

虐待狂,228-29

sadism, 228–29

萨洛蒙斯,安娜,116

Salomons, Anna, 116

塞缪尔,亚瑟,5,10,55,261

Samuel, Arthur, 5, 10, 55, 261

萨金特,汤姆,191

Sargent, Tom, 191

可扩展自主武器,112

scalable autonomous weapons, 112

施瓦布,克劳斯,117

Schwab, Klaus, 117

第二次机器时代(Brynjolfsson & McAfee),117

Second Machine Age, The (Brynjolfsson & McAfee), 117

李世石, 6 , 47 , 90 , 91 , 261

Sedol, Lee, 6, 47, 90, 91, 261

地震监测系统(NET-VISA),279 –80

seismic monitoring system (NET-VISA), 279–80

自动驾驶汽车,65 –67, 181 –82, 247

self-driving cars, 65–67, 181–82, 247

性能要求,65 –66

performance requirements for, 65–66

潜在效益,66 –67

potential benefits of, 66–67

概率规划和281 –82

probabilistic programming and, 281–82

全球尺度感知,75

sensing on global scale, 75

集合,33

sets, 33

Shakey 项目,52

Shakey project, 52

克劳德·香农,4–5,62

Shannon, Claude, 4–5, 62

席勒,罗伯特,117

Shiller, Robert, 117

侧信道攻击187,188

side-channel attacks, 187, 188

西奇威克·亨利( Henry Sidgwick),224-25

Sidgwick, Henry, 224–25

对人工智能风险的沉默,158-59

silence regarding risks of AI, 158–59

西蒙·赫伯特,76 , 86 , 265

Simon, Herbert, 76, 86, 265

程序的模拟演化,171

simulated evolution of programs, 171

SLAM(同步定位与地图构建),283

SLAM (simultaneous localization and mapping), 283

Slate Star Codex博客,146、169-70

Slate Star Codex blog, 146, 169–70

屠杀机器人,111

Slaughterbot, 111

《小世界》(洛奇),1

Small World (Lodge), 1

斯马特,R. N.,221–22

Smart, R. N., 221–22

智能家居,71-72

smart homes, 71–72

史密斯,亚当,227

Smith, Adam, 227

snopes.com,108

snopes.com, 108

社会聚集定理,220-21

social aggregation theorem, 220–21

《增长的社会极限》(赫希),230

Social Limits to Growth, The (Hirsch), 230

社交媒体和内容选择算法,8-9

social media, and content selection algorithms, 8–9

软件机器人(softbots),64

softbots, 64

软件系统,248

software systems, 248

解决方案,寻找,257 –66

solutions, searching for, 257–66

抽象规划和,264–66

abstract planning and, 264–66

组合复杂性,258

combinatorial complexity and, 258

计算活动,管理,261-62

computational activity, managing, 261–62

15 拼图和258

15-puzzle and, 258

去吧,259 –61

Go and, 259–61

地图导航和,257 –58

map navigation and, 257–58

电机控制命令和263 –64

motor control commands and, 263–64

24 拼图和258

24-puzzle and, 258

“自动化的一些道德和技术后果”(维纳),10

“Some Moral and Technical Consequences of Automation” (Wiener), 10

索菲亚(机器人),126

Sophia (robot), 126

(程序的)规范,248

specifications (of programs), 248

“关于第一台超智能机器的推测”(Good),142-43

“Speculations Concerning the First Ultraintelligent Machine” (Good), 142–43

语音识别,6

speech recognition, 6

语音识别功能,74 –75

speech recognition capabilities, 74–75

斯彭斯,迈克,117

Spence, Mike, 117

SpotMini,73

SpotMini, 73

SRI,41–42,52

SRI, 41–42, 52

标准智力模型9-11,13,48-61,247

standard model of intelligence, 9–11, 13, 48–61, 247

星际争霸,45

StarCraft, 45

斯塔西,103-4

Stasi, 103–4

平稳性,24

stationarity, 24

统计学,10,176

statistics, 10, 176

索尔·斯坦伯格88 岁

Steinberg, Saul, 88

刺激—反应模板,67

stimulus–response templates, 67

Stockfish(国际象棋程序),47

Stockfish (chess program), 47

奋斗和享受之间的关系,121 –22

striving and enjoying, relation between, 121–22

子程序, 34 , 233 –34

subroutines, 34, 233–34

萨默斯,拉里,117,120

Summers, Larry, 117, 120

Summit 机器,34,35,37

Summit machine, 34, 35, 37

桑斯坦,卡斯,244

Sunstein, Cass, 244

超级智能Bostrom),102,145,150,167,183

Superintelligence (Bostrom), 102, 145, 150, 167, 183

监督学习,58-59,285-93

supervised learning, 58–59, 285–93

监视,104

surveillance, 104

萨瑟兰,詹姆斯,71

Sutherland, James, 71

“关掉它”论点,160-61

“switch it off” argument, 160–61

突触15,16

synapses, 15, 16

西拉德,利奥,8 , 77 , 150

Szilard, Leo, 8, 77, 150

触觉感知问题,机器人,73

tactile sensing problem, robots, 73

淘宝,106

Taobao, 106

技术性失业。参见 工作、消除

technological unemployment. See work, elimination of

泰格马克马克斯4,114,138

Tegmark, Max, 4, 114, 138

特莱克斯,斯蒂芬妮(Stephanie Tellex),73

Tellex, Stephanie, 73

腾讯250

Tencent, 250

张量处理单元 (TPU),35

tensor processing units (TPUs), 35

终结者电影),112,113

Terminator (film), 112, 113

Tesauro,Gerry,55

Tesauro, Gerry, 55

泰勒,理查德,244

Thaler, Richard, 244

有闲阶级论(凡勃伦),230

Theory of the Leisure Class, The (Veblen), 230

《思考,快与慢》(卡尼曼),238

Thinking, Fast and Slow (Kahneman), 238

思考、学习,293 –95

thinking, learning from, 293–95

桑顿,理查德,133

Thornton, Richard, 133

7,8

Times, 7, 8

工具(狭义)人工智能46,47,136

tool (narrow) artificial intelligence, 46, 47, 136

TPU(张量处理单元),35

TPUs (tensor processing units), 35

公地悲剧,31

tragedy of the commons, 31

《超验骇客》(电影),3–4,141–42

Transcendence (film), 3–4, 141–42

偏好的传递性,23-24

transitivity of preferences, 23–24

《人性论》(休谟),167

Treatise of Human Nature, A (Hume), 167

部落主义,150,159-60

tribalism, 150, 159–60

卡车司机,119

truck drivers, 119

TrueSkill 系统,279

TrueSkill system, 279

塔克·阿尔伯特30 岁

Tucker, Albert, 30

图灵,艾伦32、33、37-38、40-41、124-25、134-35、140-41、144、149、153、160-61

Turing, Alan, 32, 33, 37–38, 40–41, 124–25, 134–35, 140–41, 144, 149, 153, 160–61

图灵测试,40-41

Turing test, 40–41

辅导,100 –101

tutoring, 100–101

辅导系统,70

tutoring systems, 70

2001:太空漫游(电影),141

2001: A Space Odyssey (film), 141

Uber57,182

Uber, 57, 182

UBI(全民基本收入),121

UBI (universal basic income), 121

不确定性

uncertainty

人工智能对人类偏好的不确定性,原则53,175-76

AI uncertainty as to human preferences, principle of, 53, 175–76

人类对自身偏好的不确定性,235-37

human uncertainty as to own preferences, 235–37

概率论和,273 –84

probability theory and, 273–84

联合国(UN),250

United Nations (UN), 250

全民基本收入(UBI),121

universal basic income (UBI), 121

《世界人权宣言》(1948 年),107

Universal Declaration of Human Rights (1948), 107

普遍性,32-33

universality, 32–33

通用图灵机, 33 , 40 –41

universal Turing machine, 33, 40–41

不可预测性,29

unpredictability, 29

功利主义人工智能,217-27

utilitarian AI, 217–27

功利主义(密尔),217-18

Utilitarianism (Mill), 217–18

功利主义/功利人工智能,214

utilitarianism/utilitarian AI, 214

挑战,221 –27

challenges to, 221–27

结果主义人工智能,217-19

consequentialist AI, 217–19

理想功利主义,219

ideal utilitarianism, 219

效用的人际比较,争论,222-24

interpersonal comparison of utilities, debate over, 222–24

多人,最大化效用总和,219 –26

multiple people, maximizing sum of utilities of, 219–26

偏好功利主义,220

preference utilitarianism, 220

社会聚集定理和220

social aggregation theorem and, 220

索马里问题和,226 –27

Somalia problem and, 226–27

不同规模人群的效用比较,争论,224 –25

utility comparison across populations of different sizes, debate over, 224–25

效用函数,53 –54

utility function, 53–54

效用怪物,223–24

utility monster, 223–24

效用理论,22-26

utility theory, 22–26

公理基础,23 –24

axiomatic basis for, 23–24

反对意见,24 – 26

objections to, 24–26

价值对齐,137 –38

value alignment, 137–38

瓦尔迪,摩西,202-3

Vardi, Moshe, 202–3

凡勃伦,托尔斯坦,230

Veblen, Thorstein, 230

视频游戏,45

video games, 45

虚拟现实创作,101

virtual reality authoring, 101

美德伦理学,217

virtue ethics, 217

视觉物体识别,6

visual object recognition, 6

约翰·冯·诺依曼23 岁

von Neumann, John, 23

W3C 可信网络小组,109

W3C Credible Web group, 109

《机器人总动员》(电影),255

WALL-E (film), 255

沃森80岁

Watson, 80

波函数,35-36

wave function, 35–36

“我们是专家”的论调,152 –54

“we’re the experts” argument, 152–54

白领工作,119

white-collar jobs, 119

怀特黑德,阿尔弗雷德·诺斯,88岁

Whitehead, Alfred North, 88

全脑模拟,171

whole-brain emulation, 171

维纳诺伯特10,136-38,153,203

Wiener, Norbert, 10, 136–38, 153, 203

弗兰克·威尔切克,4岁

Wilczek, Frank, 4

安德鲁·怀尔斯185

Wiles, Andrew, 185

电线刺激(wireheading),205–8

wireheading, 205–8

工作,消除,113 –24

work, elimination of, 113–24

护理专业,122

caring professions and, 122

补偿效应,114-17

compensation effects and, 114–17

历史警告,113-14

historical warnings about, 113–14

收入分配和123

income distribution and, 123

因采用人工智能技术而面临风险的职业,118-20

occupations at risk with adoption of AI technology, 118–20

重新构建教育和研究机构,以关注人类世界,123-24

reworking education and research institutions to focus on human world, 123–24

奋斗和享受之间的关系,121 –22

striving and enjoying, relation between, 121–22

全民基本收入(UBI)提案,以及121

universal basic income (UBI) proposals and, 121

工资停滞和生产率提高,自 1973 年以来,117

wage stagnation and productivity increases, since 1973, 117

“人机团队合作”论点,163

“work in human–machine teams” argument, 163

世界经济论坛,250

World Economic Forum, 250

万维网,64

World Wide Web, 64

公证人同业公会,109

Worshipful Company of Scriveners, 109

扎克伯格,马克,157

Zuckerberg, Mark, 157

关于作者

About the Author

Stuart Russell 是加州大学伯克利分校计算机科学教授,并担任 Smith-Zadeh 工程讲席教授。他曾担任世界经济论坛人工智能和机器人委员会副主席,以及联合国军备控制顾问。他是美国人工智能协会、计算机协会和美国科学促进会的会士。他与 Peter Norvig 合著了权威且广受好评的人工智能教科书《人工智能:一种现代方法》。

Stuart Russell is a professor of Computer Science and holder of the Smith-Zadeh Chair in Engineering at the University of California, Berkeley. He has served as the Vice-Chair of the World Economic Forum's Council on AI and Robotics and as an advisor to the United Nations on arms control. He is a Fellow of the American Association for Artificial Intelligence, the Association for Computing Machinery, and the American Association for the Advancement of Science. He is the author (with Peter Norvig) of the definitive and universally acclaimed textbook on AI, Artificial Intelligence: A Modern Approach.
